- B CMS 250
Co-ordination Factor. See coord
batchSize 78 collapse.facet, field collapsing 192
bf parameter 117 collapse.field, field collapsing 192
Blacklight Online Public Access Catalog. collapse.info.doc, field collapsing 193
See Blacklight OPAC, Ruby On Rails collapse.maxdocs, field collapsing 193
integrations collapse.threshold, field collapsing 193
Blacklight OPAC, Ruby On Rails collapse.type, field collapsing 192
integrations combined index 32
about 263 CommonsHttpSolrServer 235
data, indexing 263-267 complex systems, tuning
Boolean operators about 271
AND 100 CPU usage 272
AND operator, combining with OR memory usage 272
operator 101 scale deep 273
AND or && operator 101 scale high 273
NOT 100 scale wide 273
NOT operator 101 system changes 272
OR 100 components
OR or || operator 101 about 111, 159
bool element 92 solrconfig.xml 159
boost functions compressed, field option 41
boosting 137, 138 configuration files, Solr
r_event_date_earliest field 138 tag 25
boosting 70, 107 solrconfig.xml file 25
boost queries standard request handler 26
boosting 134-137 Configuration Management. See CM
bq parameter(s) 134 ConsoleHandler 204
bucketFirstLetter 148 Content Construction Kit 252
buildOnCommit 174 Content Management System. See CMS
buildOnCommit, spellchecker option 174 Continuous Integration. See CI
buildOnOptimize, spellchecker option 174 coord 112
copyField directive
C about 46
uses 46
caches
CoreDescriptor classes 231
tuning 281
core, managing 209, 210
CapitalizationFilterFactory filter 63
count, Stats component 189
CCK 252
CPU usage 272
Chainsaw
cron 289
URL 204
CSV, sending to Solr
characterEncoding, FileBasedSpellChecker
about 72
option 175
configuration options 73, 74
CharFilterFactory 62
curl
CI 128
using, to interact with Solr 66, 68
classname 173
CM 197
- D dataSource attribute 78
development console 76, 77
data, indexing documents, entities 78
stream.body parameter 67 entity 78
stream.file parameter 67 getting started 75
stream.url parameter 67 mb-dih-artists-jdbc.xml file 75, 76
through HTTP POST 67 query attribute 78
ways 67 reference document, URL 74
database Solr, registering with 75
and Lucene search index, differences 9, 10 solrconfig.xml 75
DataImportHandler. See DIH DIH, development console
dataSource attribute 78 DataSources, JdbcDataSource type 77, 78
date element 93 DIH control form 77
date facet, parameters documents, entities 79
facet.date 151 fields 79
facet.date.end 151 importing with 80
facet.date.gap 151 DIH, transformers
facet.date.hardend 151 dateTimeFormat attributes 79
facet.date.other 152 splitBy attributes 79
facet.date.start 151 template attributes 79
dates, Faceting 146 DIH fields
debugQuery, diagnostic parameter column attribute 79
about 98 name attribute 79
explainOther 98 directory structure, Solr
defaults 111 build 13
defaultSearchField, schema.xml settings 47 client 13
defType, query parameter 95 dist 13
defType parameter 128 example 14
deleteById() 232 example/etc 14
deleteByQuery() 232 example/multicore 14
denormalizing example/solr 14
one to many associated data 36, 37 example/webapps 14
one to one associated data 36 lib 14
deployment process, Solr 197, 198 site 14
df, query parameter 95 src 14
diagnostic query parameters src/java 14
debugQuery 98 src/scripts 14
echoHandler 98 src/solrj 14
echoParams 98 src/test 14
indent 98 src/webapp 14
dictionary Disjunction-Max. See dismax
about 169 DisjunctionMaxQuery
building, from source 176, 177 about 130
DIH boosts, configuring 131
about 74, 236 queried fields, configuring 131
capabilities 74 dismax 113
- dismax handler. See Dismax Solr request EdgeNGram analyzer 61
handler EdgeNGramFilterFactory 61
dismax query handler 131 EdgeNGramTokenizerFactory 61
dismax request handler 128 Elasticfox 276
Dismax Solr request handler Embedded-Solr 65
about 128 embedded Solr
automatic phrase boosting 132, 133 legacy Lucene, upgrading from 237
boost functions, boosting 137, 138 using for rich clients 237
boost queries, boosting 134-137 using in in-process streaming 236, 237
debugQuery option used 129 EmbeddedSolrServer class 224
default search 140, 141 encoder attribute 59
DisjunctionMaxQuery 130 EnglishPorterFilterFactory, stemming 54
features, over standard handler 129 Entity tags 279
limited query syntax 131 ETag 279
min-should-match 138 ETL 78
mm query parameter 138 eval() function 238
phrase slop, configuring 134 existence (and non-existence) queries 107
distanceMeasure, spellchecker option 174 explicit mapping 56
distributed search 32 Extract Transform and Load. See ETL
div(x,y), mathematical primitives 121 extraParams entry 242
doc element 93
docText field data 233 F
document
deleting 70 facet 146
documentCache 281 facet.date 151, 286
Domain Specific Language. See DSL examples 151
double element 92 facet.date.end 151
DoubleMetaphone, phonetic encoding facet.date.gap 151
algorithms 58 facet.date.hardend 151
DoubleMetaphoneFilterFactory analysis facet.date.other 152
filter, options facet.date.start 151
inject 59 facet.field 147
maxCodeLength 59 facet.limit 147
Drupal, options facet.method 148
Apache Solr Search integration module 251 facet.mincount 147
Solr, hosted by Acquia 252 facet.missing 148
DSL 269 facet.missing parameter 143
dynamic fields facet.offset 147
* fallback 46 facet.prefix 148, 156
about 45 facet.query 286
facet.query parameter 152, 153
E facet.sort 147
facet_counts 143
echoHandler, diagnostic parameter 98 faceted navigation 7, 141, 145, 153
echoParams 152 faceted search 149, 220, 221
echoParams, diagnostic parameter 98
- faceting collapse.info.count 193
about 141 collapse.info.doc 193
alphabetic range bucketing (A-C, D-F, and collapse.maxdocs 193
so on) 148, 149 collapse.threshold 193
date facet parameters 151, 152 collapse.type 192
dates 146, 149, 150 configuring 192, 193
example 142, 143 SOLR-236 191
facet.field 147 field definitions, schema.xml file
facet.limit 147 attributes 42
facet.method 148 copyField, using 46
facet.mincount 147 copyField directive, using 46
facet.missing 148 default (optional) 42
facet.missing parameter 143 dynamic fields 45
facet.offset 147 name 42
facet.prefix 148 required (optional) 42
facet.sort 147 schema.xml, settings 47
facet_counts 143 sorting 44
facet prefixing (term suggest) 156-158 sorting, limitations 44, 45
field, requisites 146 type 42
field values (text) 146 field length. See fieldNorm
filters, excluding 153-155 field list. See fl
Local Params 155 fieldNorm 112
on arbitrary parameters 152, 153 field options, schema.xml file
queries 146 compressed 41
release types, example 142, 143 indexed 41
schema changes, MusicBrainz example 144, multiValued 41
145 omitNorms (advanced) 41
text 147 positionIncrementGap (advanced) 42
types 146 sortMissingFirst 41
faceting, dates sortMissingLast 41
about 149 stored 41
examples 150 termVectors (advanced) 41
Facet prefixing 156 field qualifier 102, 103
Familiarity field references, function queries 120
URL 204 fieldType, spellchecker option 174
FastLRUCache 280 field types, schema.xml file
fetchSize 78 tag 40
field, attributes tag 40
default (optional) 42 class attribute 40
name 42 field values (text), Faceting 146
required (optional) 42 file, spellchecker 172
type 42 FileBasedSpellChecker options
field, IndexBasedSpellChecker option 174 characterEncoding 175
field collapsing, search components sourceLocation 175
about 191, 192 FileHandler logging 204
collapse.facet 192 filterCache 280
collapse.field 192 filter element 50
- filtering 108, 109 H
filters, Faceting
excluding 153, 155 Hadoop 225
first-components 111 HathiTrust 273
fl 220 Heritrix
fl, output related parameter 96 using, to download artist pages 226, 227
float element 92 highlighted field list. See hl.fl
fq, query parameter 95 highlighting component, search
function argument components
limitations 120 about 161
function queries configuring 163
_val_ pseudo-field hack 117 example 161, 163
about 117 hl 164
bf parameter 117 hl.fl 164
Daydreaming search example 119 hl.fragsize 164
example 118 hl.highlightMultiTerm 164
field references 120 hl.mergeContiguous 165
function references 120 hl.requireFieldMatch 164
incorporating, to searches 117 hl.snippets 164
t_trm_lookups 118 hl.usePhraseHighlighter 164
function query, tips 128 hl alternateField 165
function references hl formatter 165
mathematical primitives 121 hl fragmenter 165
function references, function queries 120 hl maxAnalyzedChars 165
parameters 164
G hl, highlighting component 164
hl.fl 161
q, query parameter 95 hl.fl, highlighting component 164
q.op, query parameter 95 hl.fragsize, highlighting component 164
generic XML data structure hl.highlightMultiTerm, highlighting
about 92 component 164
appends 111 hl.increment, regex fragmenter 166
arr, XML element 92 hl.mergeContiguous, highlighting
bool element 92 component 165
components 111 hl.regex.maxAnalyzedChars, regex
date element 93 fragmenter 166
defaults 111 hl.regex.pattern, regex fragmenter 166
double element 92 hl.regex.slop, regex fragmenter 166
first-components 111 hl.requireFieldMatch, highlighting
float element 92 component 164
int element 92 hl.snippets, highlighting component 164
invariants 111 hl.usePhraseHighlighter, highlighting
last-components 111 component 164
long element 92 hl alternateField, highlighting component
lst, XML element 92 165
str element 92 hl formatter, highlighting component
Git about 165
URL 11 hl.simple.pre and hl.simple.post 165
- hl fragmenter, highlighting component 165 factors, committing 285
hl maxAlternateFieldLength, highlighting factors, optimizing 285
component 165 unique document checking, disabling 285
hl maxAnalyzedChars, highlighting Index Searchers 280
component 165 Information Retrieval. See IR
home directory, Solr int element 92
bin 15 InternetArchive 226
conf 15 invariants 111
conf/schema.xml 15 Inverse Document Frequency. See IDF
conf/solrconfig.xml 15 inverse reciprocals 125
conf/xslt 15 IR 8
data 15 ISOLatin1AccentFilterFactory filter 62
lib 15 issue tracker, Solr 27
HTML, indexing in Solr 227
HTMLStripStandardTokenizerFactory 52 J
HTMLStripStandardTokenizerFactory
tokenizer 227 J2SE
HTMLStripWhitespaceTokenizerFactory 52 with JConsole 212
HTTP caching 277-279 JARmageddon 205
HTTP server request access logs, logging jarowinkler, spellchecker 172
about 201, 202 java.util.logging package 203
log directory, creating 201 Java class names
Tailing 202 abbreviated 40
org.apache.solr.schema.BoolField 40
I Java Development Kit (JDK)
URL 11
IDF 33 JavaDoc tags 234
idf 112 Java Management Extensions. See JMX
ID field 44 Java Naming and Directory Interface. See
indent, diagnostic parameter 98 JNDI
index 31 Java replication
index-time versus script 289
and query-time, boosting 113 JavaScript Object Notation. See JSON
versus query-time 57 Java Server Pages. See JSPs
index-time boosting 70 JConsole GUI
IndexBasedSpellChecker options about 212
field 174 URL 212
sourceLocation 174 JDK [1.4] logging 203
thresholdTokenFrequency 175 JDK logging 203
index data Jetty
document access, controlling 221 startup integration 205
securing 220 web.xml, customizing 218
indexed, field option 41 jetty.xml 201
indexed, schema design 282 JIRB tool 215
indexes JMX
sharding 295 about 212
indexing strategies access, controlling 220
about 283
- information extracting, JRuby used 215 HTTP server request access logs 201, 202
Solr, starting with 212-215 levels, managing at runtime 205, 206
Jmx4r 217 Solr application logging 203
JMX Console 212 types 201
JNDI 16, 200 logging.properties file 204
JNDI name 200 long element 92
jQuery 240 LowerCaseFilterFactory filter 62
jQuery Autocomplete widget 241, 242 LRUCache 280
JRuby lst, XML element 92
using, to extract JMX information 215 Lucene
JRuby Interactive Browser tool. See JIRB about 8
tool DisjunctionMaxQuery 130
JSON 238 features 8
JSONP 242 scoring 112
JSON with Padding. See JSONP Lucene’s query syntax
JSPs 17 URL 44
JUL 203 LUCENE-1435 45
JVM Lucene search index
configuration 277 and database, differences 9, 10
Lucene syntax
K query expression 100
query syntax 99
KeepWordFilterFactory filter 62 sub-expressions 101
KeywordTokenizerFactory 52
KStem, stemming 55 M
L mailing lists, Solr
URL 26
last-components 111 Managed Bean. See MBeans
LengthFilterFactory 145 mandatory clause, expression query 100
LengthFilterFactory filter 62 map() function 243
LetterTokenizerFactory 52 map(x,min,max,target), miscellaneous math
limited query syntax 131 121
disabling 132 master server
linear(x,m,c), miscellaneous math 122 indexing into 292
Local Params 155 mathematical primitives, function
LocalSolr component 194 references
log(x), mathematical primitives 121 abs(x) 121
Log4j div(x,y) 121
configuring, URL 205 log(x) 121
logging to 204 pow(x,y) 121
Log4j JAR file product(x,y,z,...) 121
URL 204 sqrt(x) 121
logarithms 123, 124 sum(x,y,z, ... ) 121
Logback Maven 228
URL 204 max(x,c), miscellaneous math 121
logging max, Stats component 189
about 201 maxGramSize 60
- maxScore 93 specific parameters 183
maxWarmingSearchers 284 using, ways 182
mb-dih-artists-jdbc.xml file 75, 76 mlt.boost 186
mb_attributes.txt mlt.fl 185
content 145 mlt.maxntp 186
MBeans 212 mlt.maxqt 186
mean, Stats component 189 mlt.maxwl 185
member_id field 36 mlt.mindf 185
memory usage 272 mlt.mintf 185
Metaphone, phonetic encoding algorithms mlt.minwl 185
58 mlt.qf 185
min, Stats component 189 mm query parameter 138
min-should-match mm specification formats
about 138 as examples 139
basic rules 139 more-like-this search component. See MLT,
multiple rules 139 search components
rules 139 more like this plugin 9
rules, choosing 140 multi-word synonyms 56
minGramSize 60 multicore
miscellaneous math, function references need for 210, 211
linear(x,m,c) 122 multiple indices 32
map(x,min,max,target) 121 multiple Solr servers
max(x,c) 121 documents, assigning to shards 296
recip(x,m,a,c) 122 indexes, sharding 295
scale(x,minTarget,maxTarget) 121 master server, indexing into 292
missing, Stats component 189 replication, configuring 291
MLT, search components script versus Java replication 289
as dedicated request handler 182 searches, distributing 291
as request handler, with external input search queries, distributing across slaves
document 183 293, 294
as Solr component 182 shards, searching across 297, 298
configuration parameters 183 slaves, configuring 292, 293
mlt 183 starting 290, 291
mlt.boost 186 multiValued, field option 41
mlt.count 183 multiValued field 221
mlt.fl 185 MusicBrainz.org 30, 31
mlt.maxntp 186
mlt.maxqt 186 N
mlt.maxwl 185
mlt.mindf 185 n-gramming costs
mlt.mintf 185 Edge n-gramming costs 62
mlt.minwl 185 tokenizer based n-gramming costs 62
mlt.qf 185 N-gramming costs, substring indexing
parameters 185, 186 a_name field 61
parameters, specific to MLT request handler a_name field + a_ngram field 61
184 minGramSize 62
results, example 186, 188 name 173
name attribute 143
- name field 33 Metaphone 58
newSearch query 284 RefinedSoundex 58
NOT operator 100, 101 Soundex 58
numFound 93 PhoneticFilterFactory filter 59
Nutch 225 phonetic sounds-like
Nutch + Web Archive eXtensions. See about 58
NutchWAX phonetic encoding algorithms 58
NutchWAX 225 phrase queries 103
phrase search performance
O improving 287
shingling, solution 287, 288
OLTP 78 phrase slop
omitNorms (advanced), field option 41 configuring 134
omitNorms, schema design 282 Plain Old Java Objects. See POJOs
omitTermFreqAndPositions, schema design POJOs
282 indexing 234
Online Transaction Processing systems. See PorterStemFilterFactory, stemming 54
OLTP positionIncrementGap (advanced), field
optional clause, expression query 100 option 42
ord() function 120, 122 pow(x,y), mathematical primitives 121
ord(fieldReference) 122 product(x,y,z, ... ), mathematical primitives
ord/rord 122 121
ord and rord, function references prohibited clause, expression query 100
ord(fieldReference) 122 PRONOM Unique Identifier. See PUID
rord(fieldReference) 122 public searches
OR operator 100 securing 219, 220
OR or || operator 101 PUID 31
output related parameters, query parameters
fl 96 Q
sort 96
version 98 q parameter
wt 97 processing 175
outputUnigrams controls 288 qt, miscellaneous parameter 95
QTime 93
P queries, Faceting 146
query-time
parse and index-time, boosting 113
parameter 243 versus index-time 57
parse() function 244 query-time boosting 70
partial indexing. See substring indexing query attribute 78
PatternReplaceFilterFactory filter 63 query converter 175
PatternTokenizerFactory 53 query elevation, search components
pf, tips 134 about 166
pf parameter 133 config-file 167, 168
phoneme 58 configuration parameters 167
phonetic encoding algorithms configuring 167
DoubleMetaphone 58 elevateArtists.xml 168
encoder attribute 59
- forceElevation 168 { and } brackets 106
queryFieldType 168 about 105, 106
query expression, clauses date math 106, 107
mandatory clause 100 readOnly 77
optional clause 100 recip(x,m,a,c), miscellaneous math 122
prohibited clause 100 reciprocals and rord, with dates 126, 127
query parameters RecordItem 234
about 95 RefinedSoundex, phonetic encoding
defType 95 algorithms 58
df 95 regex fragmenter, options
diagnostic 98 hl.increment 166
fq 95 hl.regex.pattern 166
output related parameters 96 hl.regex.slop 166
q 95 hl.regex.maxAnalyzedChars 166
q.op 95 release’s artist’s name. See r_a_name
qt 95 remote streaming
result paging 96 about 68, 221
rows 96 disabling 69
start 96 enabling 69
query parser plugin 128 remote streaming feature 224
QueryResponse object 235 RemoveDuplicatesTokenFilterFactory filter
queryResultCache 280 62
query spell checker renderResult() method 247
indexed content based 8, 9 replication
query syntax and sharding, combining 298-300
about 99 configuring 291
boosting 107 requestHandler 207
documents, matching 99 request handler
existence (and non-existence) queries 107 about 110
field qualifier 102, 103 configuration, creating 110
fuzzy queries 105 configuring 110
phrase queries 103 result() function 243, 244
query expression, clauses 100 right field type/analysis, using 109
special characters 108 rOfficial 144
sub-expressions 101 rord() 122
term proximity 103 rord(fieldReference) 122
wildcard queries 103, 104 rows parameter 96, 242
rsolr
R versus solr-ruby 269
Ruby On Rails integrations
r_a_name 42 acts_as_solr 254-259
r_attributes 144 acts_as_solr plugin 253
r_event_date_earliest field 138 Blacklight OPAC 263
r_name_facetLetter 148 Convention over Configuration 253
r_official 144 display, customizing 267
r_type 144 fields display, customizing 268, 269
range queries solr-ruby versus rsolr 269
[ and ] brackets 106 solr_data 257
- S stored 282
score boosting. See boosting
scale() function scoring
example 123 about 112
inverse reciprocals, using 124, 125 co-ordination factor (coord) 112
logarithms, using 123, 124 factors 112
reciprocals and rord with dates, using field length (fieldNorm) 112
126, 127 Inverse Document Frequency (idf) 112
scale(x,minTarget,maxTarget), query-time and index-time, boosting 113
miscellaneous math 121 term frequency (tf) 112
scale deep 298 troubleshooting 113, 114
scale high 276 script
scale wide 289 versus Java replication 289
schema, Solr search, distributing across slaves
tag 25 about 291
tag 25 master server, indexing into 292
tag 25 slaves, configuring 292, 293
primary key 25 search components
text, field name 25 about 161
schema.xml, settings field collapsing 191, 192
defaultSearchField 47 highlighting component 161
solrconfig.xml 47 MLT (more-like-this) 182
solrQueryParser 47 query elevation 166
uniqueKey 47 spellcheck 169
schema.xml file Stats component 189
tag 40 terms component 194
tag 40 termVector component 194
field definitions 42, 43 search engine 161, 223, 237, 266, 272
field options 40 searcher.num_docs attribute 216
field types 40 SearchHandler
sample 45 per search interface 207
schema design search handler 128
about 34 searching 89, 90
compressed field option 282 server access
data, denormalizing 36 limiting 217, 219
entities returned from search, determining Servlet container
35 and Solr, differences 199
inclusion of fields used in search results, installing in 199
omitting 38, 39 solr.home property, defining 199
indexed 282 sharding
omitNorms 282 and replication, combining 298-300
omitTermFreqAndPositions 282 documents, assigning 296
one to many associated data, denormalizing indexes 295, 296
36, 37 searching across 297, 298
one to one associated data, denormalizing ShingleFilterFactory 288
36 shingling 133, 127, 287
Solr powered search, determining 35
- Simple Java interface. See SolrJ issue tracker 27
Simple Logging Facade for Java package. local file accessing, example 68
See SLF4J package logging 201
single combined index mailing list 26
issues 34 official site, URL 11
schema.xml snippet, sample 32 powered artists building, autocomplete
using, issues 33 widget with jQuery used 240, 241, 242
single Solr server powered artists building, autocomplete
optimizing 276 widget with JSONP used 243
single Solr server, optimizing prerequisites 11
faceting performance, enhancing 286 query parameters 95
HTTP caching 277-279 query syntax 99
indexing strategies 283, 284 remote streaming 68, 69
JVM configuration 277 request handlers 110
phrase search performance, improving 287 resources 26
schema design considerations 282 running 17-19
Solr caching 280, 281 sample data, loading 20, 21
term vectors, using 286, 287 schema 25
tuning caches 281 search request handler 128
slaves securing 217
configuring 292 simple query, running 22-24
search queries, distributing across slaves solr.solr.home, searching for 16
293, 294 sorting 109
SLF4j 20 spell check plugin 9
SLF4J package 203 starting 15, 16
SnowballPorterFilterFactory, stemming 54 starting, with JMX 212-215
Solr statistics page 24
about 7, 10 system changes 272
and Servlet container, differences 199 testing 13
building 13 tools 58
communicating with 65 XML, sending to 69, 70
complex systems, tuning 271, 272 XML response format 93
configuration files 25, 26 Solr’s DIH DataImportHandler contrib
cores, managing 209, 210 add-on 66
CSV, sending to 72 Solr’s Wiki 26
deploying 17 Solr, accessing from PHP applications
deployment process 197, 198 about 247, 248
directory structure 13 Drupal, options 250
disjunction-max query handler 9 solr-php-client 248-250
Faceting 141 Solr, communicating with
features 8, 9 convenient client API 65
filtering 108, 109 data formats 66
function query, incorporating to searches data streamed remotely 66
117 Direct HTTP 65
generic XML data structure 92 Solr’s filesystem 66
home directory 15 Solr, data formats
interacting with, curl used 66, 68 rich documents 66
- Solr-binary 66 solr.TextField 48
Solr-XML 66 Solr 1.3 11
Solr, examples Solr 1.4 11
structure 223 Solr admin
summary 224 Assistance area 20
Solr, filters example 19
CapitalizationFilterFactory 63 Make a Query text box 20
CharFilterFactory 62 navigation menu 19
ISOLatin1AccentFilterFactory 62 Solr application logging, logging 203
KeepWordFilterFactory 62 Jetty, startup integration 205
LengthFilterFactory 62 Log4j, logging to 204
LowerCaseFilterFactory 62 logging output, configuring 203
PatternReplaceFilterFactory 63 log levels, managing at runtime 205, 206
RemoveDuplicatesTokenFilterFactory 62 solrbook-packtpub 273
StandardFilterFactory 62 Solr caching
write your own 63 autowarmCount 281
Solr, integrating class 281
JavaScript used 238, 239 configuring 281
Solr, prerequisites documentCache 281
Apache ant 11 filterCache 280
Java Development Kit (JDK) 11 queryResultCache 280
Subversion or Git 11 size 281
Solr, securing Solr cell
document access, controlling 221 binary content, extracting 81, 82
index data, securing 220 documents, indexing with 81
JMX access, controlling 220 karaoke lyrics, extracting 83-85
server access, limiting 217, 219, 220 richer documents, indexing 85-87
SOLR-236 191 Solr, configuring 83
solr-balancer 294 Solr cores
Solr-binary 66 cores, managing 209, 210
solr-php-client multicore, need for 210, 211
a_member_name array 249 solr.xml, configuring 208, 209
about 248, 249, 250 solrconfig.xml
Apache_Solr_Service, configuration 249 elements 159
solr-ruby about 75
versus rsolr 269 solrconfig.xml, schema.xml settings 47
Solr-XML 66 Solr DIH Wiki page
solr.body feature 68 URL 79
solr.home property SolrDocumentList object 235
defining 199 SolrDocument object 235
JNDI (Java Naming and Directory Interface) Solr home 16
200 SolrIndexSearch Mbean 214
solr.war file 200 SolrJ
solr.setParser(new XMLResponseParser()) about 65, 224
235 client API 230-233
solr.solr.home CommonsHttpSolrServer 224
searching for 16 embedded Solr, need for 235, 236
- EmbeddedSolrServer class 224 classname 173
Heritrix using, to download artist pages dictionary, building from source 176
226, 227 file, spellchecker 172
HTML, indexing 227-230 FileBasedSpellChecker options 175
HTMLStripStandardTokenizerFactory IndexBasedSpellChecker options 174
tokenizer 227 indexed content 169
POJOs, indexing 234, 235 jarowinkler, spellchecker 172
stream.file parameter 224 misspelled query, example 178, 180
Solr JIRA name 173
URL 12 q parameter, processing 175
SolrJS requests, issuing 177, 178
about 245, 246 schema configuration 169-171
addWidget() method 247 solrconfig.xml, configuration in 171, 172
project homepage, URL 245 Solr configuring, ways 169
SolrJS Manager object 247 spellcheck.q parameter, processing 176
URL 220 spellchecker, index and file based 173
Solrmarc 236 spellcheckers (dictionaries), configuring
SolrQuery object 235 173
solrQueryParser, schema.xml settings 47 spellcheckIndexDir 173
Solr resources text file of words 169
about 26 spellcheck.collate 178
issue tracker 27 spellcheck.count 177
mailing lists 26 spellcheck.dictionary 177
Solr’s Wiki 26 spellcheck.extendedResults 178
Solr search components spellcheck.onlyMorePopular 178
LocalSolr component 194 spellcheck.q 177
terms component 194 spellcheck.q parameter
termVector component 194 processing 176
sort, output related parameter 97 spellchecker, index and file based
sorting accuracy 174
about 44, 109 buildOnCommit 174
limitations 44 buildOnOptimize 174
string type 45 classname 173
title_sort type 45 distanceMeasure 174
sortMissingFirst, field option 41 fieldType 174
sortMissingLast, field option 41 name 173
Soundex, phonetic encoding algorithms 58 spellcheckIndexDir 173
sourceLocation, FileBasedSpellChecker spellcheckIndexDir 173
option 175 spell check plugin 9
sourceLocation, IndexBasedSpellChecker Splunk 205
option 174 sqrt(x), mathematical primitives 121
spellcheck 177 Squid
spellcheck, search components URL 279
a_spell, spellchecker 172 standard component list 160
a_spellPhrase, spellchecker 172 StandardFilterFactory filter 62
about 169 StandardTokenizerFactory 52
alternative approach 180, 182 start 93
- startEmbeddedSolr() 234 EdgeNGramFilterFactory 61
start parameter 96 EdgeNGramTokenizerFactory 61
stats, Stats component 189 n-gramming costs 61
stats.facet, Stats component 190 NGramFilterFactory, configuring with minGramSize of 2 60
stats.field, Stats component 189
Stats component, search components NGramFilterFactory, configuring with minGramSize of 5 60
about 189
configuring 189 Subversion
count 189 URL 11
max 189 sum(x,y,z, ... ), mathematical primitives 121
mean 189 sum, Stats component 189
min 189 sumOfSquares, Stats component 189
missing 189 synonyms
statistics, for track durations 190 => 56
stats 189 about 55
stats.facet 190 ignoreCase, setting true 56
stats.field 189 index-time versus query-time 57
stddev 189 WordNet, thesaurus 55
sum 189
sumOfSquares 189 T
status 93
stddev, Stats component 189 t_duration 152
stemming t_shingle 288
about 54 t_trm_lookups 118
EnglishPorterFilterFactory 54 Tailing 202
implementations 54 term-suggest 141, 156
KStem 55 term frequency. See tf
PorterStemFilterFactory 54 term proximity 103
SnowballPorterFilterFactory 54 terms component 194
StopFilterFactory 186 termVector component 194
used, for stop words filtering 57 termVectors 186
stop words term vectors 286, 287
filtering, StopFilterFactory used 57 termVectors (advanced), field option 41
stored, field option 41 text analysis
stored, schema design 282 about 47
stream.body parameter 67 experimenting with 50, 51
stream.file parameter 67, 224 highlight matches 51
stream.url parameter 67 index box 51
StreamingUpdateSolrServer 284 multi-word synonyms 56
str element 92 n-gram 60
string type 45 n-gramming costs 61, 62
sub-expressions partial indexing 60
about 101 phonetic sounds-like 58
prohibited clause, limitations 102 query box 51
substring indexing stemming 54, 55
about 60 stop words 58
analyzer configuration, n-grams used 60 substring indexing 60
synonyms 55
- term text 51 wildcard queries
text field type 50 about 103, 104
text field type definition, configuration 48 fuzzy queries 105
text field type definition, configuring 49 WordDelimiterFilterFactory 51
tokenizer 52 WordDelimiterFilterFactory,
verbose output 51 tokenizer action 50
WordDelimiter analyzer 53 WordDelimiter analyzer
WordDelimiterFilterFactory 53 splitting, ways 53, 54
WordDelimiterFilterFactory 54 tokenizing, ways 53, 54
text field type 50 WordDelimiterFilterFactory 53
tf 112 WordNet thesaurus 55
threaded_test.rb script 283, 284 write your own filter 63
thresholdTokenFrequency, wt, output related parameter 97
IndexBasedSpellChecker option 175
title_sort type 45 X
tokenizer
about 50 XML, sending to Solr
HTMLStripStandardTokenizerFactory 52 about 69, 70
HTMLStripWhitespaceTokenizerFactory 52 changes, committing 71
KeywordTokenizerFactory 52 commit and optimize 71
LetterTokenizerFactory 52 documents, deleting 70
PatternTokenizerFactory 53 rollback command 71
StandardTokenizerFactory 52 uncommitted changes, withdrawing 71
WhitespaceTokenizerFactory 52 XML response format
Tomcat 199 93
TPS 272 93
about 93
maxScore 93
U numFound 93
uniqueKey, schema.xml settings 47 QTime 93
uniqueKey field 232, 233 start 93
status 93
V URL, parsing 94
version, output related parameter 98 Y
Vigilog
URL 204 y, argument 120
W Z
WAR 199 zip format 292
web.xml
customizing, in Jetty 218
Web application archive. See WAR
WebTrends 202
WhitespaceTokenizerFactory 52
- Thank you for buying
Solr 1.4 Enterprise Search Server
Packt Open Source Project Royalties
When we sell a book written on an Open Source project, we pay a royalty directly to that
project. Therefore by purchasing Solr 1.4 Enterprise Search Server, Packt will have given some
of the money received to the Apache Solr project.
In the long term, we see ourselves and you—customers and readers of our books—as part of
the Open Source ecosystem, providing sustainable revenue for the projects we publish on.
Our aim at Packt is to establish publishing royalties as an essential part of the service and
support a business model that sustains Open Source.
If you're working with an Open Source project that you would like us to publish on, and
subsequently pay royalties to, please get in touch with us.
Writing for Packt
We welcome all inquiries from people who are interested in authoring. Book proposals
should be sent to author@packtpub.com. If your book idea is still at an early stage and you
would like to discuss it first before writing a formal book proposal, contact us; one of our
commissioning editors will get in touch with you.
We're not just looking for published authors; if you have strong technical skills but no writing
experience, our experienced editors can help you develop a writing career, or simply get some
additional reward for your expertise.
About Packt Publishing
Packt, pronounced 'packed', published its first book "Mastering phpMyAdmin for Effective
MySQL Management" in April 2004 and subsequently continued to specialize in publishing
highly focused books on specific technologies and solutions.
Our books and publications share the experiences of your fellow IT professionals in adapting
and customizing today's systems, applications, and frameworks. Our solution-based books
give you the knowledge and power to customize the software and technologies you're using
to get the job done. Packt books are more specific and less general than the IT books you have
seen in the past. Our unique business model allows us to bring you more focused information,
giving you more of what you need to know, and less of what you don't.
Packt is a modern, yet unique publishing company, which focuses on producing quality,
cutting-edge books for communities of developers, administrators, and newbies alike. For
more information, please visit our website: www.PacktPub.com.