r110663 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r110663
Date: 16:47, 3 February 2012
Author: oren
Status: deferred
Tags:
Comment:
Project documentation stubs
Modified paths:
  • /trunk/lucene-search-3/src/site/apt/configure.apt (modified)
  • /trunk/lucene-search-3/src/site/apt/development.apt (modified)
  • /trunk/lucene-search-3/src/site/apt/faq.apt (modified)
  • /trunk/lucene-search-3/src/site/apt/goals.apt (modified)
  • /trunk/lucene-search-3/src/site/apt/index.apt (modified)
  • /trunk/lucene-search-3/src/site/apt/usage.apt (modified)

Diff

Index: trunk/lucene-search-3/src/site/apt/configure.apt
@@ -4,13 +4,23 @@
 
 Installation
 
+ TODO: instruction on installation here
+
+ * Sharding
+
+ how to split the index into shards (multiple machines)
+
+ *Replication
+
+ how to replicate the distributed search
+
 * Indexing
 
- TODO: instruction on installation here
+ TODO: building an initial index
 
 * Updates
 
- TODO: instruction on index update
+ TODO: regular index update
 
 Search
 
@@ -28,14 +38,20 @@
 
 Reporting
 
+ Statistics in special search page
+ Document, Searches, Edit to Index Time.
+
+* Special Page
+
 * JMX
 
 * Ganglia
 
+* Search Analytics
 
 Administration
 
-* Document Limits
+* Document Length Limit
 
 TODO: limits on document size etc
 
Index: trunk/lucene-search-3/src/site/apt/development.apt
@@ -2,14 +2,61 @@
 Development
 -------------
 
+* Requirements
+
+ Java Jdk version 1.6
+ Maven 2
+
+
+ To build you need to download the source from the SVN repository.
+
+
 NLP Data Build
 
+
+* Corpus
+
+ TODO: describe how train a Max Entropy Model based sentence boundary chunker.
+
+* Preliminary Processing
+
+ TODO: describe how convert a Wikipedia dump to a corpus
+ setting chunker
+ TODO: describe how convert a Wikitionary to compatible lexicon format
+
 * Phonology
 
+ TODO: describe how to extend text to IPA conversion for "sound like" searching
+
 * Morphology
 
+ TODO: describe how to generate a Monolingual word list
+ TODO: describe how to generate a N-gram model from a Monolingual lexicon for language detection
+ TODO: describe how to bootstrap a morphology
+ TODO: describe how to induct a morphology from a Monolingual lexicon
+
+ TODO: describe use the morphology for tagging a corpus
+ TODO: describe use the morphology for tagging a corpus
+
+
+* Semantics
+
+ TODO: describe how to build a lexical thesaurus.
+ TODO: describe how to bootstrap a cross language thesaurus.
+ TODO: describe how to induct a cross language thesaurus.
+
+ TODO: describe how to import/localize an ontology/merology indexing data.
+* Cross Language
+
+ TODO: describe how to get and update bi-lingual dictionaries from Apertium.
+ TODO: describe how to specify (Personal) names
+ TODO: describe how to override entries import
+ TODO: describe how to build
+
 * Named Entities
 
+ TODO: describe how train Named Entity detection model.
+
 SOLR Build
 
 * Integration
\ No newline at end of file
Index: trunk/lucene-search-3/src/site/apt/usage.apt
@@ -1 +1,3 @@
-Usage
\ No newline at end of file
+-----
+Usage
+-----
\ No newline at end of file
Index: trunk/lucene-search-3/src/site/apt/goals.apt
@@ -1 +1,21 @@
-Project Goals
\ No newline at end of file
+-------------
+Project Goals
+-------------
+
+ Mission:
+
+ Build a Search engine that will provide excellent search for mediaWiki projects.
+
+ * Goals
+
+ Ease of installation,
+ Distribution of large indexes,
+ Replication for speed
+ Short edit to search time
+ Better language support
+ Support for Wiki based meta data
+
+ Crowdsource features - Localization etc
+ Search Knowledge Repository - Allow client projects to share Ontology, NLP, Entity data in specific domains
+
+
\ No newline at end of file
Index: trunk/lucene-search-3/src/site/apt/faq.apt
@@ -1 +1,3 @@
-Frquently Asked Questions
\ No newline at end of file
+--------------------------
+Frequently Asked Questions
+--------------------------
\ No newline at end of file
Index: trunk/lucene-search-3/src/site/apt/index.apt
@@ -1 +1,3 @@
-Index
\ No newline at end of file
+-----
+Index
+-----
\ No newline at end of file
