Index: trunk/lucene-search-3/src/site/apt/development.apt |
— | — | @@ -2,15 +2,13 @@ |
3 | 3 | Development |
4 | 4 | ------------- |
5 | 5 | |
6 | | -* Requirements |
| 6 | +Requirements |
7 | 7 | |
8 | 8 | Java Jdk version 1.6 |
9 | 9 | Maven 2 |
10 | 10 | |
11 | | - |
12 | 11 | To build you need to download the source from the SVN repository. |
13 | 12 | |
14 | | - |
15 | 13 | NLP Data Build |
16 | 14 | |
17 | 15 | |
— | — | @@ -20,42 +18,43 @@ |
21 | 19 | |
22 | 20 | * Preliminary Processing |
23 | 21 | |
24 | | - TODO: describe how convert a Wikipedia dump to a corpus |
| 22 | + TODO: describe how convert a Wikipedia dump to a corpus |
25 | 23 | setting chunker |
26 | | - TODO: describe how convert a Wikitionary to compatible lexicon format |
| 24 | + TODO: describe how convert a Wikitionary to compatible lexicon format |
27 | 25 | |
28 | 26 | * Phonology |
29 | 27 | |
30 | | - TODO: describe how to extend text to IPA conversion for "sound like" searching |
| 28 | + TODO: describe how to extend text to IPA conversion for "sound like" searching |
31 | 29 | |
32 | 30 | * Morphology |
33 | 31 | |
34 | | - TODO: describe how to generate a Monolingual word list |
35 | | - TODO: describe how to generate a N-gram model from a Monolingual lexicon for language detection |
36 | | - TODO: describe how to bootstrap a morphology |
37 | | - TODO: describe how to induct a morphology from a Monolingual lexicon |
| 32 | + TODO: describe how to generate a Monolingual word list |
| 33 | + TODO: describe how to generate a N-gram model from a Monolingual lexicon for language detection |
| 34 | + TODO: describe how to bootstrap a morphology |
| 35 | + TODO: describe how to induct a morphology from a Monolingual lexicon |
38 | 36 | |
39 | | - TODO: describe use the morphology for tagging a corpus |
40 | | - TODO: describe use the morphology for tagging a corpus |
| 37 | + TODO: describe use the morphology for tagging a corpus |
| 38 | + TODO: describe use the morphology for tagging a corpus |
41 | 39 | |
42 | 40 | |
43 | 41 | * Semantics |
44 | 42 | |
45 | | - TODO: describe how to build a lexical thesaurus. |
46 | | - TODO: describe how to bootstrap a cross language thesaurus. |
47 | | - TODO: describe how to induct a cross language thesaurus. |
| 43 | + TODO: describe how to build a lexical thesaurus. |
| 44 | + TODO: describe how to bootstrap a cross language thesaurus. |
| 45 | + TODO: describe how to induct a cross language thesaurus. |
48 | 46 | |
49 | | - TODO: describe how to import/localize an ontology/merology indexing data. |
| 47 | + TODO: describe how to import/localize an ontology/merology indexing data. |
| 48 | + |
50 | 49 | * Cross Language |
51 | 50 | |
52 | | - TODO: describe how to get and update bi-lingual dictionaries from Apertium. |
53 | | - TODO: describe how to specify (Personal) names |
54 | | - TODO: describe how to override entries import |
55 | | - TODO: describe how to build |
| 51 | + TODO: describe how to get and update bi-lingual dictionaries from Apertium. |
| 52 | + TODO: describe how to specify (Personal) names |
| 53 | + TODO: describe how to override entries import |
| 54 | + TODO: describe how to build |
56 | 55 | |
57 | 56 | * Named Entities |
58 | 57 | |
59 | | - TODO: describe how train Named Entity detection model. |
| 58 | + TODO: describe how train Named Entity detection model. |
60 | 59 | |
61 | 60 | SOLR Build |
62 | 61 | |