Merge branch 'develop' into devel-work-experiments
* develop:
  #891 use better field analyzer for shelfmark
  ...and more formatting
  As usual, fix formatting
  Add more solr installation notes
xhero committed Jun 28, 2021
2 parents 52a1eb6 + 572aed4 commit 1796000
Showing 4 changed files with 32 additions and 52 deletions.
30 changes: 30 additions & 0 deletions 2- INSTALL.rdoc
@@ -142,6 +142,36 @@ Muscat 7.1 now supports Solr 8.8 installed externally. This means for example th
The internal Solr 5.5 server still works, but it will no longer be supported in the future, since the bundled Solr is unsupported and has problems with modern Java versions.
The Muscat core should work with any installation of Solr. A sample installation procedure is shown here, but it should be adapted for each installation.

==== Installation using a "stock" Solr with the included setup script

Grab a copy of the official distribution (8.8.2 at the time of writing) and unpack it in a suitable place:

wget https://archive.apache.org/dist/lucene/solr/8.8.2/solr-8.8.2.tgz
tar -xvzf solr-8.8.2.tgz

Solr needs two directories, one for data and one for the binaries. The latter is generally /opt/solr, and the data directory for this example will be /data/solr.
Extract just the installation script:

tar xzf solr-8.8.2.tgz solr-8.8.2/bin/install_solr_service.sh --strip-components=2

And execute it:

sudo ./install_solr_service.sh solr-8.8.2.tgz -d /data/solr -p 8983

This creates the directory layout described above, along with a +solr+ user. For the complete reference, see https://solr.apache.org/guide/8_8/taking-solr-to-production.html
Now copy over the Muscat core:

cp -R $MUSCAT_HOME/solr-configuration/muscat /data/solr/data
cp $MUSCAT_HOME/solr-configuration/jar-8.8-linux/ThemaxQuery-1.0-SNAPSHOT.jar /data/solr/data/lib/

Here +/data/solr/+ is the directory specified above with -d.
Solr includes a startup script, so

systemctl daemon-reload
service solr start

should start it. The log files in this case are found in +/data/solr/logs+.

==== Installation using a "stock" solr

Grab a copy of the official distribution (8.8.2 at the time of writing) and unpack it in a suitable place:
21 changes: 1 addition & 20 deletions solr-configuration/muscat/schema.xml
@@ -189,30 +189,11 @@
</analyzer>
</fieldtype>

<fieldType name="text_alphanumeric_sort" class="solr.TextField" sortMissingLast="false" omitNorms="true">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory"/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="^(a |the |les |la |le |l'|de la |du |des )" replacement="" replace="all"
/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="(\d+)" replacement="00000$1" replace="all"
/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="0*([0-9]{6,})" replacement="$1" replace="all"
/>
<filter class="solr.PatternReplaceFilterFactory"
pattern="([^a-z0-9])" replacement="" replace="all"
/>
</analyzer>
</fieldType>
<fieldType name="text_alphanumeric_sort" class="solr.ICUCollationField" locale="" numeric="true" strength="secondary" sortMissingLast="true" />
<!-- END ADDED BY RZ FOR MUSCAT -->
</types>
<uniqueKey>id</uniqueKey>

<copyField source="*_text" dest="textSpell"/>
<copyField source="*_s" dest="textSpell"/>
</schema>

2 changes: 0 additions & 2 deletions solr-configuration/muscat/solrconfig.xml
@@ -5,7 +5,6 @@
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lib" regex="icu4j-\d.*\.jar"/>
<lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lucene-libs"
regex="lucene-analyzers-icu-\d.*\.jar"/>
<lib dir="../../jar"/>

<dataDir>${solr.data.dir:}</dataDir>

@@ -249,4 +248,3 @@
</lst>
</requestHandler>
</config>

31 changes: 1 addition & 30 deletions solr/configsets/sunspot/conf/schema.xml
@@ -112,36 +112,7 @@
</analyzer>
</fieldtype>

<fieldType name="text_alphanumeric_sort" class="solr.TextField" sortMissingLast="false" omitNorms="true">
<analyzer>
<!-- KeywordTokenizer does no actual tokenizing, so the entire
input string is preserved as a single token
-->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<!-- The LowerCase TokenFilter does what you expect, which can be useful
when you want your sorting to be case insensitive
-->
<filter class="solr.LowerCaseFilterFactory" />
<!-- The TrimFilter removes any leading or trailing whitespace -->
<filter class="solr.TrimFilterFactory" />
<!-- Remove leading articles -->
<filter class="solr.PatternReplaceFilterFactory"
pattern="^(a |the |les |la |le |l'|de la |du |des )" replacement="" replace="all"
/>
<!-- Left-pad numbers with zeroes -->
<filter class="solr.PatternReplaceFilterFactory"
pattern="(\d+)" replacement="00000$1" replace="all"
/>
<!-- Left-trim zeroes to produce 6 digit numbers -->
<filter class="solr.PatternReplaceFilterFactory"
pattern="0*([0-9]{6,})" replacement="$1" replace="all"
/>
<!-- Remove all but alphanumeric characters -->
<filter class="solr.PatternReplaceFilterFactory"
pattern="([^a-z0-9])" replacement="" replace="all"
/>
</analyzer>
</fieldType>
<fieldType name="text_alphanumeric_sort" class="solr.ICUCollationField" locale="" numeric="true" strength="secondary" sortMissingLast="true" />
<!-- END ADDED BY RZ FOR MUSCAT -->

</types>
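
To make it easier to see what the removed +text_alphanumeric_sort+ analyzer chain computed, here is a rough Python sketch of the same steps, in the order given by the comments in the hunk above. It is an approximation for illustration only, not code from Muscat or Solr: the function name +old_sort_key+ is hypothetical, and Python's regex engine and lowercasing stand in for the Java ones, so edge cases may differ.

```python
import re

# Leading articles stripped by the first PatternReplaceFilter in the removed chain.
_LEADING_ARTICLE = re.compile(r"^(a |the |les |la |le |l'|de la |du |des )")


def old_sort_key(value: str) -> str:
    """Approximate the removed text_alphanumeric_sort chain (hypothetical helper)."""
    key = value.lower()                                # LowerCaseFilterFactory
    key = key.strip()                                  # TrimFilterFactory
    key = _LEADING_ARTICLE.sub("", key)                # remove a leading article
    key = re.sub(r"([0-9]+)", r"00000\g<1>", key)      # left-pad every run of digits
    key = re.sub(r"0*([0-9]{6,})", r"\g<1>", key)      # trim zeroes back to 6 or more digits
    key = re.sub(r"[^a-z0-9]", "", key)                # drop everything except a-z and 0-9
    return key


if __name__ == "__main__":
    for sample in ["The Book 12", "Ms. 9", "Ms. 10", "Église 3"]:
        print(repr(sample), "->", repr(old_sort_key(sample)))
```

With this sketch, "Ms. 9" sorts before "Ms. 10" because both numbers are padded to six digits, while the last pattern also discards non-ASCII letters such as the initial "É". The replacement field type leaves ordering to +solr.ICUCollationField+ with +numeric="true"+, which this sketch does not attempt to reproduce.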