Indexing: enable Dataverse to use Solr in a distributed environment #1083

Closed
kcondon opened this issue Nov 4, 2014 · 4 comments
Labels
Type: Feature a feature request UX & UI: Design This issue needs input on the design of the UI and from the product owner

Comments

@kcondon
Contributor

kcondon commented Nov 4, 2014

Dataverse is currently coded to connect to Solr on localhost. To run in a distributed environment, the app needs to be able to connect to a remote Solr instance. Ideally this would be done with SolrCloud and a ZooKeeper ensemble: a list of ZooKeeper hosts would be stored somewhere, and the app would use it to initialize a CloudSolrServer object.
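
To make the ZooKeeper idea a bit more concrete, here is a minimal sketch of the "list of ZooKeeper hosts" part. It assumes the list is stored as individual `host:port` entries and only builds the comma-separated string that SolrJ's `CloudSolrServer` takes as its `zkHost` argument. `ZkHostList`, `toZkHost`, and the `zk1`/`zk2` host names are hypothetical, not existing Dataverse code.

```java
// Sketch only: joins configured ZooKeeper hosts into the comma-separated
// "zkHost" string that SolrJ's CloudSolrServer accepts.
// ZkHostList and toZkHost are hypothetical names, not Dataverse code.
final class ZkHostList {

    private ZkHostList() {
    }

    /** Turns entries such as "zk1:2181" and " zk2:2181 " into "zk1:2181,zk2:2181". */
    static String toZkHost(String... hosts) {
        if (hosts == null || hosts.length == 0) {
            throw new IllegalArgumentException("at least one ZooKeeper host is required");
        }
        StringBuilder zkHost = new StringBuilder();
        for (String host : hosts) {
            // Tolerate stray whitespace from a settings table or config file.
            String trimmed = host == null ? "" : host.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("blank ZooKeeper host entry");
            }
            if (zkHost.length() > 0) {
                zkHost.append(',');
            }
            zkHost.append(trimmed);
        }
        return zkHost.toString();
    }
}

// With SolrJ 4.x on the classpath, the result would be handed over roughly as:
//   CloudSolrServer server = new CloudSolrServer(ZkHostList.toZkHost("zk1:2181", "zk2:2181"));
//   server.setDefaultCollection("collection1");
```

Where the list itself would live (a settings table, a JVM option, a properties file) is still an open question in this issue.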

@kcondon kcondon added Type: Feature a feature request UX & UI: Design This issue needs input on the design of the UI and from the product owner Status: Dev labels Nov 4, 2014
@kcondon kcondon added this to the Beta 9 - Dataverse 4.0 milestone Nov 4, 2014
pdurbin added a commit that referenced this issue Nov 10, 2014
- added `SolrHostColonPort` config option
- Updated Vagrant to allow spin up of multiple VMs
@pdurbin
Member

pdurbin commented Nov 12, 2014

In 57dc52c I enabled a new SolrHostColonPort setting and documented it at https://github.com/IQSS/dataverse/blob/master/doc/Sphinx/source/Installers/dataverse-installer-main.rst#solrhostcolonport

Here's an example of how to use it:

curl -X PUT http://localhost:8080/api/s/settings/:SolrHostColonPort/localhost:8983

@kcondon is this good enough for Dataverse 4.0 or is the use of ZooKeeper a hard requirement? Can we defer ZooKeeper to a future release? I'll put this in QA so you can at least test this new setting.
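
For anyone wondering how a `host:port` value like the one above could be consumed, here is a rough sketch. It is not the code from 57dc52c. It assumes the setting value arrives as a plain string, falls back to `localhost:8983` when unset, and that Solr is served under `/solr`. `SolrEndpoint`, `baseUrl`, and `DEFAULT_HOST_COLON_PORT` are hypothetical names.

```java
// Sketch only: resolves a "host:port" setting value into a Solr base URL.
// SolrEndpoint, baseUrl, DEFAULT_HOST_COLON_PORT and the "/solr" path are
// assumptions for illustration, not the actual Dataverse implementation.
final class SolrEndpoint {

    static final String DEFAULT_HOST_COLON_PORT = "localhost:8983";

    private SolrEndpoint() {
    }

    static String baseUrl(String hostColonPort) {
        // Fall back to the local default when the setting is missing or blank.
        String value = (hostColonPort == null || hostColonPort.trim().isEmpty())
                ? DEFAULT_HOST_COLON_PORT
                : hostColonPort.trim();

        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("expected host:port but got: " + value);
        }

        String host = value.substring(0, colon);
        int port;
        try {
            port = Integer.parseInt(value.substring(colon + 1));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("port is not a number in: " + value, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range in: " + value);
        }
        return "http://" + host + ":" + port + "/solr";
    }
}
```

Validating the value up front means a mistyped setting fails with a clear message instead of a connection error at index time.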

@pdurbin pdurbin assigned kcondon and unassigned pdurbin Nov 12, 2014
@kcondon
Contributor Author

kcondon commented Nov 13, 2014

OK, this config param works. Will need to test on multi server env. As for whether it is ok for prod, maybe a discussion for others as well?

@kcondon
Contributor Author

kcondon commented Nov 17, 2014

Tested on multiple systems, and they are able to update the index and see each other's updates. I do not know yet how it performs under load.

Benson still hopes we can use Zookeeper and thinks the config isn't that much more involved.

Closing as basic functionality has been delivered.

@kcondon kcondon closed this as completed Nov 17, 2014
@pdurbin
Member

pdurbin commented Nov 17, 2014

> Benson still hopes we can use Zookeeper and thinks the config isn't that much more involved.

Right, he showed me we'd start Solr like this:

java -DzkRun -Dbootstrap_confdir=solr/collection1/conf -Dbootstrap_conf=true -jar start.jar

At least that's how developers would start Solr with a single collection/core/thing. In production ZooKeeper would manage multiple.
