Index: trunk/lucene-search-3/src/site/apt/configure.apt |
— | — | @@ -4,13 +4,23 @@ |
5 | 5 | |
6 | 6 | Installation |
7 | 7 | |
| 8 | + TODO: instruction on installation here |
| 9 | + |
| 10 | + * Sharding |
| 11 | + |
| 12 | + how to split the index into shards (multiple machines) |
| 13 | + |
| 14 | + * Replication |
| 15 | + |
| 16 | + how to replicate the distributed search |
| 17 | + |
8 | 18 | * Indexing |
9 | 19 | |
10 | | - TODO: instruction on installation here |
| 20 | + TODO: building an initial index |
11 | 21 | |
12 | 22 | * Updates |
13 | 23 | |
14 | | - TODO: instruction on index update |
| 24 | + TODO: regular index update |
15 | 25 | |
16 | 26 | Search |
17 | 27 | |
— | — | @@ -28,14 +38,20 @@ |
29 | 39 | |
30 | 40 | Reporting |
31 | 41 | |
| 42 | + Statistics on a special search page: |
| 43 | + documents, searches, edit-to-index time. |
| 44 | + |
| 45 | +* Special Page |
| 46 | + |
32 | 47 | * JMX |
33 | 48 | |
34 | 49 | * Ganglia |
35 | 50 | |
| 51 | +* Search Analytics |
36 | 52 | |
37 | 53 | Administration |
38 | 54 | |
39 | | -* Document Limits |
| 55 | +* Document Length Limit |
40 | 56 | |
41 | 57 | TODO: limits on document size etc |
42 | 58 | |
Index: trunk/lucene-search-3/src/site/apt/development.apt |
— | — | @@ -2,14 +2,61 @@ |
3 | 3 | Development |
4 | 4 | ------------- |
5 | 5 | |
| 6 | +* Requirements |
| 7 | + |
| 8 | + Java JDK version 1.6 |
| 9 | + Maven 2 |
| 10 | + |
| 11 | + |
| 12 | + To build, you need to download the source from the SVN repository. |
| 13 | + |
| 14 | + |
6 | 15 | NLP Data Build |
7 | 16 | |
| 17 | + |
| 18 | +* Corpus |
| 19 | + |
| 20 | + TODO: describe how to train a Max Entropy model based sentence boundary chunker. |
| 21 | + |
| 22 | +* Preliminary Processing |
| 23 | + |
| 24 | + TODO: describe how to convert a Wikipedia dump to a corpus |
| 25 | + setting chunker |
| 26 | + TODO: describe how to convert a Wiktionary to a compatible lexicon format |
| 27 | + |
8 | 28 | * Phonology |
9 | 29 | |
| 30 | + TODO: describe how to extend text-to-IPA conversion for "sounds like" searching |
| 31 | + |
10 | 32 | * Morphology |
11 | 33 | |
| 34 | + TODO: describe how to generate a monolingual word list |
| 35 | + TODO: describe how to generate an N-gram model from a monolingual lexicon for language detection |
| 36 | + TODO: describe how to bootstrap a morphology |
| 37 | + TODO: describe how to induce a morphology from a monolingual lexicon |
| 38 | + |
| 39 | + TODO: describe how to use the morphology for tagging a corpus |
| 40 | + TODO: describe how to use the morphology for tagging a corpus |
| 41 | + |
| 42 | + |
| 43 | +* Semantics |
| 44 | + |
| 45 | + TODO: describe how to build a lexical thesaurus. |
| 46 | + TODO: describe how to bootstrap a cross-language thesaurus. |
| 47 | + TODO: describe how to induce a cross-language thesaurus. |
| 48 | + |
| 49 | + TODO: describe how to import/localize ontology/mereology indexing data. |
| 50 | +* Cross Language |
| 51 | + |
| 52 | + TODO: describe how to get and update bilingual dictionaries from Apertium. |
| 53 | + TODO: describe how to specify (personal) names |
| 54 | + TODO: describe how to override entries import |
| 55 | + TODO: describe how to build |
| 56 | + |
12 | 57 | * Named Entities |
13 | 58 | |
| 59 | + TODO: describe how to train a Named Entity detection model. |
| 60 | + |
14 | 61 | SOLR Build |
15 | 62 | |
16 | 63 | * Integration |
\ No newline at end of file |
Index: trunk/lucene-search-3/src/site/apt/usage.apt |
— | — | @@ -1 +1,3 @@ |
2 | | -Usage |
\ No newline at end of file |
| 2 | +----- |
| 3 | +Usage |
| 4 | +----- |
\ No newline at end of file |
Index: trunk/lucene-search-3/src/site/apt/goals.apt |
— | — | @@ -1 +1,21 @@ |
2 | | -Project Goals |
\ No newline at end of file |
| 2 | +------------- |
| 3 | +Project Goals |
| 4 | +------------- |
| 5 | + |
| 6 | + Mission: |
| 7 | + |
| 8 | + Build a search engine that will provide excellent search for MediaWiki projects. |
| 9 | + |
| 10 | + * Goals |
| 11 | + |
| 12 | + Ease of installation |
| 13 | + Distribution of large indexes |
| 14 | + Replication for speed |
| 15 | + Short edit-to-search time |
| 16 | + Better language support |
| 17 | + Support for wiki-based metadata |
| 18 | + |
| 19 | + Crowdsourced features: localization, etc. |
| 20 | + Search knowledge repository: allow client projects to share ontology, NLP, and entity data in specific domains |
| 21 | + |
| 22 | + |
\ No newline at end of file |
Index: trunk/lucene-search-3/src/site/apt/faq.apt |
— | — | @@ -1 +1,3 @@ |
2 | | -Frquently Asked Questions |
\ No newline at end of file |
| 2 | +-------------------------- |
| 3 | +Frequently Asked Questions |
| 4 | +-------------------------- |
\ No newline at end of file |
Index: trunk/lucene-search-3/src/site/apt/index.apt |
— | — | @@ -1 +1,3 @@ |
2 | | -Index |
\ No newline at end of file |
| 2 | +----- |
| 3 | +Index |
| 4 | +----- |
\ No newline at end of file |