Free documents: Scientific reports

Scientific report: A COMMON FRAMEWORK FOR ANALYSIS AND GENERATION

It seems highly desirable to use a single representation of linguistic knowledge for both analysis and generation. We argue that the only part of the average NL system's knowledge that we can have any faith in is its vocabulary and, to a lesser extent, its syntactic rules, and we investigate the consequences of this for generation.

1 ANALYSIS

Consider a typical NLU system. You give it a piece of text, say:

(1) The house I live in is damp.

It grinds away, trying out syntactic rules until it has an analysis of the structure of the text. ...

8/30/2018 3:08:10 AM +00:00

Scientific report: Sixth Conference of the European Chapter of the Association for Computational Linguistics

This volume contains the papers prepared for the Sixth Conference of the European Chapter of the Association for Computational Linguistics, held 19-23 April 1993 in Utrecht. The Programme Committee received a large number of submissions (5 page extended abstracts) from all over the world. The general quality of the submissions was high. Out of a total of 229 submissions, 47 were accepted, including 7 reserve papers. Every abstract submitted was reviewed by one member of the Programme Committee and three referees (see pages v and vi). ...

Scientific report: The Incremental Generation of Passive Sentences

This paper sketches some basic features of the SYNPHONICS account of the computational modelling of incremental language production with the example of the generation of passive sentences. The SYNPHONICS approach aims at linking psycholinguistic insights into the nature of the human natural language production process with well-established assumptions in theoretical and computational linguistics concerning the representation and processing of grammatical knowledge.

Scientific report: Experiments in Reusability of Grammatical Resources

Substantial formal grammatical and lexical resources exist in various NLP systems and in the form of textbook specifications. In the present paper we report on experimental results obtained in manual, semi-automatic and automatic migration of entire computational or textbook descriptions (as opposed to a more informal reuse of ideas or the design of a single polytheoretic representation) from a variety of formalisms into the ALEP formalism. The choice of ALEP (a comparatively lean, typed feature structure formalism based on rewrite rules) was motivated by the assumption that the study would be most interesting if the target formalism is...

Scientific report: Talking About Trees

In this paper we introduce a modal language LT for imposing constraints on trees, and an extension LT(LF) for imposing constraints on trees decorated with feature structures. The motivation for introducing these languages is to provide tools for formalising grammatical frameworks perspicuously, and the paper illustrates this by showing how the leading ideas of GPSG can be captured in LT(LF). In addition, the role of modal languages (and in particular, what we have called layered modal languages) as constraint formalisms for linguistic theorising is discussed in some detail. ...

Scientific report: Decidability and Undecidability in stand-alone Feature Logics

This paper investigates the complexity of the satisfiability problem for feature logics strong enough to code entire grammars unaided. We show that feature logics capable of both enforcing re-entrancy and stating linguistic generalisations will have undecidable satisfiability problems even when most Boolean expressivity has been discarded. We exhibit a decidable fragment, but the restrictions imposed to ensure decidability render it unfit for stand-alone use. The import of these results is discussed, and we conclude that there is a need for feature logics that are less homogeneous in their treatment of linguistic structure. ...

Scientific report: Using an Annotated Corpus as a Stochastic Grammar

In Data Oriented Parsing (DOP), an annotated corpus is used as a stochastic grammar. An input string is parsed by combining subtrees from the corpus. As a consequence, one parse tree can usually be generated by several derivations that involve different subtrees. This leads to a statistics where the probability of a parse is equal to the sum of the probabilities of all its derivations. In (Scha, 1990) an informal introduction to DOP is given, while (Bod, 1992a) provides a formalization of the theory. ...
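
The relationship between derivations and parse probability stated in this abstract can be illustrated with a small sketch. It assumes, as is usual in DOP but not stated in the abstract itself, that the probability of a derivation is the product of the probabilities of the subtrees it combines. The subtree names and probability values are invented for illustration and do not come from the paper.

```python
from math import prod

# Hypothetical subtree probabilities (illustrative values only).
subtree_prob = {"t1": 0.5, "t2": 0.4, "t3": 0.2, "t4": 0.25}

# Two hypothetical derivations that both produce the same parse tree,
# each given as the list of corpus subtrees it combines.
derivations_of_parse = [["t1", "t2"], ["t3", "t4"]]


def derivation_probability(derivation):
    """Assumed model: product of the probabilities of the subtrees used."""
    return prod(subtree_prob[name] for name in derivation)


def parse_probability(derivations):
    """Probability of a parse: sum over all derivations that yield it."""
    return sum(derivation_probability(d) for d in derivations)


if __name__ == "__main__":
    print(parse_probability(derivations_of_parse))
```

Because several derivations contribute to one parse, the most probable derivation need not belong to the most probable parse, which is why the sum is taken before parses are compared.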

Scientific report: Data-Oriented Methods for Grapheme-to-Phoneme Conversion

It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that using supervised learning techniques, based on a corpus of transcribed words, the same and even better performance can be achieved, without explicit modeling of linguistic knowledge. In this paper we present two instances of this approach.

Scientific report: Disjunctions and Inheritance in the Context Feature Structure System

Substantial efforts have been made in order to cope with disjunctions in constraint based grammar formalisms (e.g. [Kasper, 1987; Maxwell and Kaplan, 1991; Dörre and Eisele, 1990]). This paper describes the roles of disjunctions and inheritance in the use of feature structures and their formal semantics. With the notion of contexts we abstract from the graph structure of feature structures and properly define the search space of alternatives. The graph unification algorithm precomputes nogood combinations, and a specialized search procedure which we propose here uses them as a controlling factor in order to delay decisions as long as there...

Scientific report: A Strategy for Dynamic Interpretation: a Fragment and an Implementation

The strategy for natural language interpretation presented in this paper implements the dynamics of context change by translating natural language texts into a meaning representation language consisting of (descriptions of) programs, in the spirit of dynamic predicate logic (DPL) [5]. The difference with DPL is that the usual DPL semantics is replaced by an error state semantics [2]. This allows for the treatment of unbound anaphors, as in DPL, but also of presuppositions and presupposition projection. ...

Scientific report: Head-driven Parsing for Lexicalist Grammars: Experimental Results

We present evidence that head-driven parsing strategies lead to efficiency gains over standard parsing strategies, for lexicalist, concatenative and unification-based grammars. A head-driven parser applies a rule only after a phrase matching the head has been derived. By instantiating the head of the rule important information is obtained about the left-hand-side and the other elements of the right-hand-side. We have used two different head-driven parsers and a number of standard parsers to parse with lexicalist grammars for English and for Dutch. ...

Scientific report: An Endogeneous Corpus-Based Method for Structural Noun Phrase Disambiguation

In this paper, we describe a method for structural noun phrase disambiguation which mainly relies on the examination of the text corpus under analysis and doesn't need to integrate any domain-dependent lexico- or syntactico-semantic information. This method is implemented in the Terminology Extraction Software LEXTER. We first explain why the integration of LEXTER in the LEXTER-K project, which aims at building a tool for knowledge extraction from large technical text corpora, requires improving the quality of the terminology extracted by LEXTER. Then we briefly describe the way LEXTER works and show what kind of disambiguation it has to perform...

Scientific report: Morphonology in the Lexicon

In this paper we present a means of defining morphonological phenomena in an inheritance based lexicon. We make use of the theory behind the formal language MOLUSC, in which morphological alternations were defined as mappings between sequences of tree-structured syllables. We discuss how the alternations can be defined in the inheritance-based lexical representation language DATR, and how the phonological aspects can be built upon to bring it closer to an integrated lexicon with representations which can be used by both the morphology and phonology of a language. ...

Scientific report: LFG Semantics via Constraints

Semantic theories of natural language associate meanings with utterances by providing meanings for lexical items and rules for determining the meaning of larger units given the meanings of their parts. Traditionally, meanings are combined via function composition, which works well when constituent structure trees are used to guide semantic composition. More recently, the functional structure of LFG has been used to provide the syntactic information necessary for constraining derivations of meaning in a cross-linguistically uniform format. ...

Scientific report: On the notion of uniqueness

In the paper it is argued that for some linguistic phenomena, current discourse representation structures are insufficiently fine-grained, both from the perspective of serving as representation in NLP and from a truth conditional perspective. One such semantic phenomenon is uniqueness. It is demonstrated that certain elements are forced to have a unique interpretation, from a certain point in discourse onwards. This could be viewed as the semantic counterpart of surface order. Although it has always been acknowledged that the left-to-right order of constituents influences the meaning of an utterance, it is, for example, not reflected in standard Discourse Representation...

Scientific report: Automating the Acquisition of Bilingual Terminology

As the acquisition problem of bilingual lists of terminological expressions is formidable, it is worthwhile to investigate methods to compile such lists as automatically as possible. In this paper we discuss experimental results for a number of methods, which operate on corpora of previously translated texts. Keywords: parallel corpora, tagging, terminology acquisition.

Scientific report: Parsing with polymorphism

Certain phenomena resist coverage within the Lambek Calculus, such as scope ambiguity and non-peripheral extraction. I have argued in previous work that an extension called Polymorphic Lambek Calculus (PLC), which adds variables and their universal quantification, covers these phenomena. However, a major problem is the absence of a known decision procedure for PLC grammars. This paper proposes a decision procedure which covers a subset of all the possible PLC grammars, a subset which, however, includes the PLC grammars with wide coverage. ...

Scientific report: The donkey strikes back: Extending the dynamic interpretation constructively

The dynamic interpretation of a formula as a binary relation (inducing transitions) on states is extended by alternative treatments of implication, universal quantification, negation and disjunction that are more dynamic (in a precise sense) than the usual reductions to tests from quantified dynamic logic (which, nonetheless, can be recovered from the new connectives). An analysis of the donkey sentence followed by the assertion It will kick back is provided.

Scientific report: A unification-based approach to multiple VP Ellipsis resolution

An assumption shared by many theories of discourse is that discourse structure constrains anaphora resolution (cf. [Grosz and Sidner 1986] for definite NPs, [Lascarides and Asher 1991], [Nakhimovsky 1988] for temporal anaphora, [Webber 1990] for deictic pronouns and [Gardent 1991], [Prüst and Scha 1990] for VP ellipsis). The aim of this paper is (i) to show that this assumption also applies to multiple VP ellipsis (VPE), (ii) to argue that other levels of linguistic information (such as syntax and semantics) interact with discourse structure in determining multiple VPE acceptability and (iii) to make these intuitions precise by providing a unification-based...

Scientific report: Rule-based Acquisition and Maintenance of Lexical and Semantic Knowledge

The lexicons for Knowledge-Based Machine Translation systems require knowledge intensive morphological, syntactic and semantic information. This information is often used in different ways and usually formatted for a specific NLP system. This tends to make both the acquisition and maintenance of lexical databases cumbersome, inefficient and error-prone. In order to solve these problems, we have developed a program called COOL which automates the acquisition and maintenance processes and allows us to standardize and centralize the databases. ...

Scientific report: A Computational Treatment of Sentence

We describe a computational system which parses discourses consisting of sequences of simple sentences. These contain a range of temporal constructions, including time adverbials, progressive aspect and various aspectual classes. In particular, the grammar generates the required readings, according to the theoretical analysis of (Glasbey, forthcoming), for sentence-final 'then'.

Scientific report: Towards a proper treatment of coercion phenomena

The interpretation of coercion constructions (to begin a book) has been recently considered as resulting from the operation of type changing. For instance, a phrase of type o (object) is coerced to a phrase of type e (event) under the influence of the predicate. We show that this procedure encounters empirical difficulties. Focussing on the begin/commencer case, we show that the coercion interpretation results both from general semantic processes and properties of the predicate, and we argue that it is best represented at the lexical level. ...

Scientific report: Identifying Topic and Focus by an Automatic Procedure

An algorithm for automatic identification of topic and focus of the sentence is presented, based on dependency syntax and using written input, which is much more ambiguous than spoken utterance.

Scientific report: A Probabilistic Context-free Grammar for Disambiguation in Morphological Parsing

One of the major problems one is faced with when decomposing words into their constituent parts is ambiguity: the generation of multiple analyses for one input word, many of which are implausible. In order to deal with ambiguity, the MORphological PArser MORPA is provided with a probabilistic context-free grammar (PCFG), i.e. it combines a conventional context-free morphological grammar to filter out ungrammatical segmentations with a probability-based scoring function which determines the likelihood of each successful parse. ...
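
The two-stage idea in this abstract, grammatical filtering followed by probabilistic scoring, can be sketched as follows. This is not MORPA's actual implementation: the toy English rules, the probability values, and the choice to score a parse as the product of its rule probabilities (the textbook PCFG definition) are all assumptions made for illustration.

```python
from math import prod

# Hypothetical rule probabilities for a toy morphological grammar.
rule_prob = {
    ("Word", ("Stem", "Suffix")): 0.5,
    ("Word", ("Prefix", "Stem")): 0.3,
    ("Stem", ("read",)): 0.1,
    ("Stem", ("ader",)): 0.001,
    ("Suffix", ("er",)): 0.2,
    ("Prefix", ("re",)): 0.4,
}

# Candidate segmentations of "reader" with the rules each analysis would need.
candidates = {
    "read+er": [("Word", ("Stem", "Suffix")), ("Stem", ("read",)), ("Suffix", ("er",))],
    "re+ader": [("Word", ("Prefix", "Stem")), ("Prefix", ("re",)), ("Stem", ("ader",))],
    # Needs a rule that the toy grammar does not contain, so it is filtered out.
    "rea+der": [("Word", ("Stem", "Stem")), ("Stem", ("rea",)), ("Stem", ("der",))],
}


def score(rules):
    """Return None for an unlicensed analysis, else the product of rule probabilities."""
    if any(rule not in rule_prob for rule in rules):
        return None
    return prod(rule_prob[rule] for rule in rules)


# Stage 1: keep only analyses the grammar licenses. Stage 2: rank them by score.
scores = {seg: s for seg, rules in candidates.items() if (s := score(rules)) is not None}
ranked = sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    for seg in ranked:
        print(seg, scores[seg])
```

With these invented numbers the analysis read+er outranks re+ader, and rea+der never reaches the scoring stage.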

Scientific report: Restriction and Correspondence-based Translation

Kaplan et al. (1989) present a framework for translation based on the description and correspondence concepts of Lexical-Functional Grammar (Kaplan and Bresnan, 1982). Certain phenomena, in particular the head-switching of adverbs and verbs, seem to be problematic for that approach. In this paper we suggest that these difficulties are more properly considered as the result of defective monolingual analyses. We propose a new description-language operator, restriction, to permit a succinct formal encoding of the informal intuition that semantic units sometimes correspond to subsets of functional information. ...

Scientific report: A Discourse Copying Algorithm for Ellipsis and Anaphora Resolution

We give an analysis of ellipsis resolution in terms of a straightforward discourse copying algorithm that correctly predicts a wide range of phenomena. The treatment does not suffer from problems inherent in identity-of-relations analyses. Furthermore, in contrast to the approach of Dalrymple et al. [1991], the treatment directly encodes the intuitive distinction between full NPs and the referential elements that corefer with them through what we term role linking.

Scientific report: Inheriting Verb Alternations

The paper shows how the verbal lexicon can be formalised in a way that captures and exploits generalisations about the alternation behaviour of verb classes. An alternation is a pattern in which a number of words share the same relationship between a pair of senses. The alternations captured are ones where the different senses specify different relationships between syntactic complements and semantic arguments, as between bake in John is baking the cake and The cake is baking. The formal language used is DATR. ...

Scientific report: Linguistic Knowledge Acquisition from Parsing Failures

A semi-automatic procedure of linguistic knowledge acquisition is proposed, which combines corpus-based techniques with the conventional rule-based approach. The rule-based component generates all the possible hypotheses of defects which the existing linguistic knowledge might contain, when it fails to parse a sentence. The rule-based component does not try to identify the defects, but generates a set of hypotheses and the corpus-based component chooses the plausible ones among them.

Scientific report: Similarity between Words Computed by Spreading Activation on an English Dictionary

This paper proposes a method for measuring semantic similarity between words as a new tool for text analysis. The similarity is measured on a semantic network constructed systematically from a subset of the English dictionary, LDOCE (Longman Dictionary of Contemporary English). Spreading activation on the network can directly compute the similarity between any two words in the Longman Defining Vocabulary, and indirectly the similarity of all the other words in LDOCE.

Scientific report: Mathematical Aspects of Command Relations

In GB, the importance of phrase-structure rules has dwindled in favour of nearness conditions. Today, nearness conditions play a major role in defining the correct linguistic representations. They are expressed in terms of special binary relations on trees called command relations. Yet, while the formal theory of phrase-structure grammars is quite advanced, no formal investigation into the properties of command relations has been done. We will try to close this gap. In particular, we will study the intrinsic properties of command relations as relations on trees as well as the possibility to reduce nearness conditions expressed by command...
