diff --git a/.gitignore b/.gitignore index 25dbdb29064..7077e324a2f 100644 --- a/.gitignore +++ b/.gitignore @@ -33,3 +33,11 @@ scripts/api/setup-all.sh* # ctags generated tag file tags + +# dependencies I'm not sure we're allowed to redistribute / have in version control +conf/docker-aio/dv/deps/ + +# no need to check aio installer zip into vc +conf/docker-aio/dv/install/dvinstall.zip +# or copy of test data +conf/docker-aio/testdata/ diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index f9d658c31d3..83b7c2d0cea 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,12 +1,12 @@ # Contributing to Dataverse -Thank you for your interest in contributing to Dataverse! We are open to contributions from everyone. You don't need permission to participate, just jump in using the resources below. If you have questions, reach out to us on the [#dataverse IRC channel][], and hang around a while, as it may take time for community members to de-idle. +Thank you for your interest in contributing to Dataverse! We are open to contributions from everyone. You don't need permission to participate. Just jump in. If you have questions, please reach out using one or more of the channels described below. -We aren't just looking for developers, there are many ways to contribute to Dataverse. We welcome contributions of ideas, bug reports, usability research/feedback, documentation, code, and more! +We aren't just looking for developers. There are many ways to contribute to Dataverse. We welcome contributions of ideas, bug reports, usability research/feedback, documentation, code, and more! ## Ideas/Feature Requests -Your idea or feature request might already be captured in the Dataverse [issue tracker] on GitHub but if not, the best way to bring it to the community's attention is by posting on the [dataverse-community Google Group][]. You're also welcome make some noise in the [#dataverse IRC channel][] (which is [logged][]) or cram your idea into 140 characters and mention [@dataverseorg][] on Twitter. To discuss your idea privately, please email it to support@dataverse.org +Your idea or feature request might already be captured in the Dataverse [issue tracker] on GitHub but if not, the best way to bring it to the community's attention is by posting on the [dataverse-community Google Group][] or bringing it up on a [Community Call][]. You're also welcome to make some noise in the [#dataverse IRC channel][] (which is [logged][]) or cram your idea into 280 characters and mention [@dataverseorg][] on Twitter.
To discuss your idea privately, please email it to support@dataverse.org There's a chance your idea is already on our roadmap, which is available at http://dataverse.org/goals-roadmap-and-releases @@ -14,9 +14,6 @@ There's a chance your idea is already on our roadmap, which is available at http [logged]: http://irclog.iq.harvard.edu/dataverse/today [issue tracker]: https://github.com/IQSS/dataverse/issues [@dataverseorg]: https://twitter.com/dataverseorg -[Functional Requirements Document (FRD for short)]: https://docs.google.com/document/d/1PRyAlP6zlUlUuHfgyUezzuaVQ4JnapvgtGWo0o7tLEs/edit?usp=sharing -[Balsamiq]: https://iqssharvard.mybalsamiq.com/projects -[Functional Requirements Document folder on Google Drive]: https://drive.google.com/folderview?id=0B3_V6vFxEcx-fl92ek92OG1nTmhQenBRX1Z4OVJBLXpURmh2d2RyX1NZRUp6YktaYUU5YTA&usp=sharing ## Usability testing @@ -26,15 +23,15 @@ Please email us at support@dataverse.org if you are interested in participating An issue is a bug (a feature is no longer behaving the way it should) or a feature (something new to Dataverse that helps users complete tasks). You can browse the Dataverse [issue tracker] on GitHub by open or closed issues or by milestones. -Before submitting an issue, please search the existing issues by using the search bar at the top of the page. If there is an existing issue that matches the issue you want to report, please add a comment to it. +Before submitting an issue, please search the existing issues by using the search bar at the top of the page. If there is an existing open issue that matches the issue you want to report, please add a comment to it. -If there is no pre-existing issue, please click on the "New Issue" button, log in, and write in what the issue is (unless it is a security issue which should be reported privately to security@dataverse.org). +If there is no pre-existing issue or it has been closed, please click on the "New Issue" button, log in, and write in what the issue is (unless it is a security issue which should be reported privately to security@dataverse.org). If you do not receive a reply to your new issue or comment in a timely manner, please email support@dataverse.org with a link to the issue. ### Writing an Issue -For the subject of an issue, please start it by writing the feature or functionality it relates to, i.e. "Create Account:..." or "Dataset Page:...". In the body of the issue, please outline the issue you are reporting with as much detail as possible. In order for the Dataverse development team to best respond to the issue, we need as much information about the issue as you can provide. Include steps to reproduce bugs. Indicate which version you're using. We love screenshots! +For the subject of an issue, please start it by writing the feature or functionality it relates to, i.e. "Create Account:..." or "Dataset Page:...". In the body of the issue, please outline the issue you are reporting with as much detail as possible. In order for the Dataverse development team to best respond to the issue, we need as much information about the issue as you can provide. Include steps to reproduce bugs. Indicate which version you're using, which is shown at the bottom of the page. We love screenshots! 
### Issue Attachments @@ -51,13 +48,20 @@ The source for the documentation at http://guides.dataverse.org/en/latest/ is in ## Code/Pull Requests -Before you start coding, please reach out to us either on the [dataverse-community Google Group][], the [dataverse-dev Google Group][], [IRC][] (#dataverse on freenode), or via support@dataverse.org to make sure the effort is well coordinated and we avoid merge conflicts. +We love code contributions. Developers are not limited to the main Dataverse code in this git repo. You can help with API client libraries in your favorite language that are mentioned in the [API Guide][] or create a new library. You can help work on configuration management code that's mentioned in the [Installation Guide][]. The Installation Guide also covers a new concept called "external tools" that allows developers to create their own tools that are available from within an installation of Dataverse. -Please read http://guides.dataverse.org/en/latest/developers/version-control.html to understand how we use the "git flow" model of development and how we will encourage you to create a GitHub issue (if it doesn't exist already) to associate with your pull request. +[API Guide]: http://guides.dataverse.org/en/latest/api +[Installation Guide]: http://guides.dataverse.org/en/latest/installation -After making your pull request, your goal should be to help it advance through our kanban board at https://waffle.io/IQSS/dataverse . If no one has moved your pull request to the code review column in a timely manner, please reach out. We maintain a list of [community contributors][] so please let us know if you'd like to be added or removed from the list. Thanks! +If you are interested in working on the main Dataverse code, great! Before you start coding, please reach out to us either on the [dataverse-community Google Group][], the [dataverse-dev Google Group][], [IRC][] (#dataverse on freenode), or via support@dataverse.org to make sure the effort is well coordinated and we avoid merge conflicts. We maintain a list of [community contributors][] and [dev efforts][] the community is working on so please let us know if you'd like to be added or removed from either list. + +Please read http://guides.dataverse.org/en/latest/developers/version-control.html to understand how we use the "git flow" model of development and how we will encourage you to create a GitHub issue (if it doesn't exist already) to associate with your pull request. That page also includes tips on making a pull request. + +After making your pull request, your goal should be to help it advance through our kanban board at https://waffle.io/IQSS/dataverse . If no one has moved your pull request to the code review column in a timely manner, please reach out. Thanks! 
[dataverse-community Google Group]: https://groups.google.com/group/dataverse-community +[Community Call]: https://dataverse.org/community-calls [dataverse-dev Google Group]: https://groups.google.com/group/dataverse-dev [IRC]: http://chat.dataverse.org [community contributors]: https://docs.google.com/spreadsheets/d/1o9DD-MQ0WkrYaEFTD5rF_NtyL8aUISgURsAXSL7Budk/edit?usp=sharing +[dev efforts]: https://groups.google.com/d/msg/dataverse-community/X2diSWYll0w/ikp1TGcfBgAJ diff --git a/Dockerfile b/Dockerfile new file mode 100644 index 00000000000..5f492ea0594 --- /dev/null +++ b/Dockerfile @@ -0,0 +1 @@ +# See `conf/docker` for Docker images diff --git a/Vagrantfile b/Vagrantfile index 5df7800195f..b3c6e7b39a9 100644 --- a/Vagrantfile +++ b/Vagrantfile @@ -8,18 +8,13 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.define "standalone", primary: true do |standalone| config.vm.hostname = "standalone" - standalone.vm.box = "puppet-vagrant-boxes.puppetlabs.com-centos-65-x64-virtualbox-puppet.box" + # Uncomment this temporarily to get `vagrant destroy` to work + #standalone.vm.box = "puppetlabs/centos-7.2-64-puppet" operating_system = "centos" if ENV['OPERATING_SYSTEM'].nil? - puts "OPERATING_SYSTEM environment variable not specified. Using #{operating_system} by default.\nTo specify it in bash: export OPERATING_SYSTEM=debian" - config.vm.box_url = "http://puppet-vagrant-boxes.puppetlabs.com/centos-65-x64-virtualbox-puppet.box" - config.vm.box = "puppet-vagrant-boxes.puppetlabs.com-centos-65-x64-virtualbox-puppet.box" - elsif ENV['OPERATING_SYSTEM'] == 'centos7' - puts "WARNING: CentOS 7 specified. Newer than what the dev team tests on." - config.vm.box_url = "https://atlas.hashicorp.com/puppetlabs/boxes/centos-7.2-64-puppet/versions/1.0.1/providers/virtualbox.box" - config.vm.box = "puppetlabs-centos-7.2-64-puppet-1.0.1-virtualbox.box" - standalone.vm.box = "puppetlabs-centos-7.2-64-puppet-1.0.1-virtualbox.box" + config.vm.box = "puppetlabs/centos-7.2-64-puppet" + config.vm.box_version = '1.0.1' elsif ENV['OPERATING_SYSTEM'] == 'debian' puts "WARNING: Debian specified. Here be dragons! https://github.com/IQSS/dataverse/issues/1059" config.vm.box_url = "http://puppet-vagrant-boxes.puppetlabs.com/debian-73-x64-virtualbox-puppet.box" diff --git a/conf/docker-aio/0prep_deps.sh b/conf/docker-aio/0prep_deps.sh new file mode 100755 index 00000000000..056bec03eb6 --- /dev/null +++ b/conf/docker-aio/0prep_deps.sh @@ -0,0 +1,27 @@ +#!/bin/sh +if [ ! -d dv/deps ]; then + mkdir -p dv/deps +fi +wdir=`pwd` +if [ ! -e dv/deps/glassfish4dv.tgz ]; then + echo "glassfish dependency prep" + mkdir -p /tmp/dv-prep/gf + cd /tmp/dv-prep/gf + wget http://download.java.net/glassfish/4.1/release/glassfish-4.1.zip + wget http://search.maven.org/remotecontent?filepath=org/jboss/weld/weld-osgi-bundle/2.2.10.Final/weld-osgi-bundle-2.2.10.Final-glassfish4.jar -O weld-osgi-bundle-2.2.10.Final-glassfish4.jar + unzip glassfish-4.1.zip + rm glassfish4/glassfish/modules/weld-osgi-bundle.jar + mv weld-osgi-bundle-2.2.10.Final-glassfish4.jar glassfish4/glassfish/modules + tar zcf $wdir/dv/deps/glassfish4dv.tgz glassfish4 + cd $wdir + # assuming that folks usually have /tmp auto-clean as needed +fi + +if [ ! -e dv/deps/solr-4.6.0dv.tgz ]; then + echo "solr dependency prep" + # schema changes *should* be the only ones... 
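# (For context: the stock Solr 4.6.0 tarball is downloaded unchanged here; the only
#  Dataverse-specific changes to Solr *should* be schema changes, and the customized
#  schema.xml is copied over the stock one later, during the Docker image build - see c7.dockerfile.)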
+ cd dv/deps/ + wget https://archive.apache.org/dist/lucene/solr/4.6.0/solr-4.6.0.tgz -O solr-4.6.0dv.tgz + cd ../../ +fi + diff --git a/conf/docker-aio/1prep.sh b/conf/docker-aio/1prep.sh new file mode 100755 index 00000000000..88642a89d78 --- /dev/null +++ b/conf/docker-aio/1prep.sh @@ -0,0 +1,24 @@ +#!/bin/sh + +export LANG="en_US.UTF-8" + +# move things necessary for integration tests into build context. +# this was based off the phoenix deployment; and is likely uglier and bulkier than necessary in a perfect world + +mkdir -p testdata/doc/sphinx-guides/source/_static/util/ +cp ../solr/4.6.0/schema.xml testdata/ +cp ../jhove/jhove.conf testdata/ +cd ../../ +cp -r scripts conf/docker-aio/testdata/ +cp doc/sphinx-guides/source/_static/util/pg8-createsequence-prep.sql conf/docker-aio/testdata/doc/sphinx-guides/source/_static/util/ +cp doc/sphinx-guides/source/_static/util/createsequence.sql conf/docker-aio/testdata/doc/sphinx-guides/source/_static/util/ + +# not using dvinstall.zip for setupIT.bash; but still used in install.bash for normal ops +mvn clean +./scripts/database/homebrew/custom-build-number +mvn package +cd scripts/installer +make clean +make +cp dvinstall.zip ../../conf/docker-aio/dv/install/ + diff --git a/conf/docker-aio/c7.dockerfile b/conf/docker-aio/c7.dockerfile new file mode 100644 index 00000000000..cf589ff1525 --- /dev/null +++ b/conf/docker-aio/c7.dockerfile @@ -0,0 +1,40 @@ +FROM centos:7 +# OS dependencies +RUN yum install -y java-1.8.0-openjdk-headless postgresql-server sudo epel-release unzip perl curl +RUN yum install -y jq + +# copy and unpack dependencies (solr, glassfish) +COPY dv /tmp/dv +COPY testdata/schema.xml /tmp/dv +RUN cd /opt ; tar zxf /tmp/dv/deps/solr-4.6.0dv.tgz +RUN cd /opt ; tar zxf /tmp/dv/deps/glassfish4dv.tgz + +RUN sudo -u postgres /usr/bin/initdb -D /var/lib/pgsql/data +#RUN sudo -u postgres createuser dvnapp + +# copy configuration related files +RUN cp /tmp/dv/pg_hba.conf /var/lib/pgsql/data/ ; cp /tmp/dv/schema.xml /opt/solr-4.6.0/example/solr/collection1/conf/schema.xml + +# skipping glassfish user and solr user (run both as root) + +#solr port +EXPOSE 8983 + +# postgres port +EXPOSE 5432 + +# glassfish port +EXPOSE 8080 + +RUN mkdir /opt/dv + +# yeah - still not happy if glassfish isn't in /usr/local :< +RUN ln -s /opt/glassfish4 /usr/local/glassfish4 +COPY dv/install/ /opt/dv/ +COPY install.bash /opt/dv/ +COPY entrypoint.bash /opt/dv/ +COPY testdata /opt/dv/testdata +COPY testscripts/* /opt/dv/testdata/ +COPY setupIT.bash /opt/dv +WORKDIR /opt/dv +CMD ["/opt/dv/entrypoint.bash"] diff --git a/conf/docker-aio/default.config b/conf/docker-aio/default.config new file mode 100644 index 00000000000..7c99866be17 --- /dev/null +++ b/conf/docker-aio/default.config @@ -0,0 +1,16 @@ +HOST_DNS_ADDRESS localhost +GLASSFISH_DIRECTORY /opt/glassfish4 +ADMIN_EMAIL +MAIL_SERVER mail.hmdc.harvard.edu +POSTGRES_ADMIN_PASSWORD secret +POSTGRES_SERVER db +POSTGRES_PORT 5432 +POSTGRES_DATABASE dvndb +POSTGRES_USER dvnapp +POSTGRES_PASSWORD secret +SOLR_LOCATION idx +TWORAVENS_LOCATION NOT INSTALLED +RSERVE_HOST localhost +RSERVE_PORT 6311 +RSERVE_USER rserve +RSERVE_PASSWORD rserve diff --git a/conf/docker-aio/dv/install/default.config b/conf/docker-aio/dv/install/default.config new file mode 100644 index 00000000000..7c99866be17 --- /dev/null +++ b/conf/docker-aio/dv/install/default.config @@ -0,0 +1,16 @@ +HOST_DNS_ADDRESS localhost +GLASSFISH_DIRECTORY /opt/glassfish4 +ADMIN_EMAIL +MAIL_SERVER mail.hmdc.harvard.edu +POSTGRES_ADMIN_PASSWORD secret 
+POSTGRES_SERVER db +POSTGRES_PORT 5432 +POSTGRES_DATABASE dvndb +POSTGRES_USER dvnapp +POSTGRES_PASSWORD secret +SOLR_LOCATION idx +TWORAVENS_LOCATION NOT INSTALLED +RSERVE_HOST localhost +RSERVE_PORT 6311 +RSERVE_USER rserve +RSERVE_PASSWORD rserve diff --git a/conf/docker-aio/dv/pg_hba.conf b/conf/docker-aio/dv/pg_hba.conf new file mode 100644 index 00000000000..77feba5247d --- /dev/null +++ b/conf/docker-aio/dv/pg_hba.conf @@ -0,0 +1,91 @@ +# PostgreSQL Client Authentication Configuration File +# =================================================== +# +# Refer to the "Client Authentication" section in the PostgreSQL +# documentation for a complete description of this file. A short +# synopsis follows. +# +# This file controls: which hosts are allowed to connect, how clients +# are authenticated, which PostgreSQL user names they can use, which +# databases they can access. Records take one of these forms: +# +# local DATABASE USER METHOD [OPTIONS] +# host DATABASE USER ADDRESS METHOD [OPTIONS] +# hostssl DATABASE USER ADDRESS METHOD [OPTIONS] +# hostnossl DATABASE USER ADDRESS METHOD [OPTIONS] +# +# (The uppercase items must be replaced by actual values.) +# +# The first field is the connection type: "local" is a Unix-domain +# socket, "host" is either a plain or SSL-encrypted TCP/IP socket, +# "hostssl" is an SSL-encrypted TCP/IP socket, and "hostnossl" is a +# plain TCP/IP socket. +# +# DATABASE can be "all", "sameuser", "samerole", "replication", a +# database name, or a comma-separated list thereof. The "all" +# keyword does not match "replication". Access to replication +# must be enabled in a separate record (see example below). +# +# USER can be "all", a user name, a group name prefixed with "+", or a +# comma-separated list thereof. In both the DATABASE and USER fields +# you can also write a file name prefixed with "@" to include names +# from a separate file. +# +# ADDRESS specifies the set of hosts the record matches. It can be a +# host name, or it is made up of an IP address and a CIDR mask that is +# an integer (between 0 and 32 (IPv4) or 128 (IPv6) inclusive) that +# specifies the number of significant bits in the mask. A host name +# that starts with a dot (.) matches a suffix of the actual host name. +# Alternatively, you can write an IP address and netmask in separate +# columns to specify the set of hosts. Instead of a CIDR-address, you +# can write "samehost" to match any of the server's own IP addresses, +# or "samenet" to match any address in any subnet that the server is +# directly connected to. +# +# METHOD can be "trust", "reject", "md5", "password", "gss", "sspi", +# "krb5", "ident", "peer", "pam", "ldap", "radius" or "cert". Note that +# "password" sends passwords in clear text; "md5" is preferred since +# it sends encrypted passwords. +# +# OPTIONS are a set of options for the authentication in the format +# NAME=VALUE. The available options depend on the different +# authentication methods -- refer to the "Client Authentication" +# section in the documentation for a list of which options are +# available for which authentication methods. +# +# Database and user names containing spaces, commas, quotes and other +# special characters must be quoted. Quoting one of the keywords +# "all", "sameuser", "samerole" or "replication" makes the name lose +# its special character, and just match a database or username with +# that name. +# +# This file is read on server startup and when the postmaster receives +# a SIGHUP signal. 
If you edit the file on a running system, you have +# to SIGHUP the postmaster for the changes to take effect. You can +# use "pg_ctl reload" to do that. + +# Put your actual configuration here +# ---------------------------------- +# +# If you want to allow non-local connections, you need to add more +# "host" records. In that case you will also need to make PostgreSQL +# listen on a non-local interface via the listen_addresses +# configuration parameter, or via the -i or -h command line switches. + + + +# TYPE DATABASE USER ADDRESS METHOD + +# "local" is for Unix domain socket connections only +#local all all peer +local all all trust +# IPv4 local connections: +#host all all 127.0.0.1/32 trust +host all all 0.0.0.0/0 trust +# IPv6 local connections: +host all all ::1/128 trust +# Allow replication connections from localhost, by a user with the +# replication privilege. +#local replication postgres peer +#host replication postgres 127.0.0.1/32 ident +#host replication postgres ::1/128 ident diff --git a/conf/docker-aio/entrypoint.bash b/conf/docker-aio/entrypoint.bash new file mode 100755 index 00000000000..fe8b3030677 --- /dev/null +++ b/conf/docker-aio/entrypoint.bash @@ -0,0 +1,10 @@ +#!/usr/bin/env bash + +sudo -u postgres /usr/bin/postgres -D /var/lib/pgsql/data & +cd /opt/solr-4.6.0/example/ +java -DSTOP.PORT=8079 -DSTOP.KEY=a09df7a0d -jar start.jar & + +cd /opt/glassfish4 +bin/asadmin start-domain +sleep infinity + diff --git a/conf/docker-aio/install.bash b/conf/docker-aio/install.bash new file mode 100755 index 00000000000..6f1ed2fa5dd --- /dev/null +++ b/conf/docker-aio/install.bash @@ -0,0 +1,9 @@ +#!/usr/bin/env bash +sudo -u postgres createuser --superuser dvnapp +#./entrypoint.bash & +unzip dvinstall.zip +cd dvinstall/ +./install -admin_email=pameyer+dvinstall@crystal.harvard.edu -y -f > install.out 2> install.err + +echo "installer complete" +cat install.err diff --git a/conf/docker-aio/readme.txt b/conf/docker-aio/readme.txt new file mode 100644 index 00000000000..7ef7247c23f --- /dev/null +++ b/conf/docker-aio/readme.txt @@ -0,0 +1,30 @@ +first pass docker all-in-one image, intended for running integration tests against. + +Could be potentially usable for normal development as well. + + +Initial setup (aka - do once): +- Do surgery on glassfish4 and solr4.6.0 following guides, place results in `conf/docker-aio/dv/deps` as `glassfish4dv.tgz` and `solr-4.6.0dv.tgz` respectively. If you `cd conf/docker-aio` and run `./0prep_deps.sh` these tarballs will be constructed for you. + +Per-build: +- `cd conf/docker-aio`, and run `1prep.sh` to copy files for integration test data into docker build context; `1prep.sh` will also build the war file and installation zip file +- build the docker image: `docker build -t dv0 -f c7.dockerfile .` + +- Run image: `docker run -d -p 8083:8080 --name dv dv0` (aka - forward port 8083 locally to 8080 in the container) + +Note: If you see an error like this... `docker: Error response from daemon: Conflict. The container name "/dv" is already in use by container "5f72a45b68c86c7b0f4305b83ce7d663020329ea4e30fa2a3ce9ddb05223533d". You have to remove (or rename) that container to be able to reuse that name.` ... run something like `docker ps -a | grep dv` to see the container left over from the last run and something like `docker rm 5f72a45b68c8` to remove it. Then try the `docker run` command above again. 
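(A shorter cleanup, assuming nothing in the old container needs to be kept: `docker rm -f dv` stops and removes the running container named `dv` in one step, after which the `docker run` command above can be repeated.)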
+ +- Installation (integration test): `docker exec -it dv /opt/dv/setupIT.bash` + +(Note that it's possible to customize the installation by editing `conf/docker-aio/default.config` and running `docker exec -it dv /opt/dv/install.bash` but for the purposes of integration testing, the `setupIT.bash` script above works fine.) + +- update `dataverse.siteUrl` (appears only necessary for `DatasetsIT.testPrivateUrl`): `docker exec -it dv /usr/local/glassfish4/bin/asadmin create-jvm-options "-Ddataverse.siteUrl=http\://localhost\:8083"` + +Run integration tests: + +First, cd back to the root of the repo where the `pom.xml` file is (`cd ../..` assuming you're still in the `conf/docker-aio` directory). Then run the test suite with script below: + +`conf/docker-aio/run-test-suite.sh` + +There isn't any strict requirement on the local port (8083 in this doc), the name of the image (dv0) or container (dv), these can be changed as desired as long as they are consistent. + diff --git a/conf/docker-aio/run-test-suite.sh b/conf/docker-aio/run-test-suite.sh new file mode 100755 index 00000000000..6bbf23dee18 --- /dev/null +++ b/conf/docker-aio/run-test-suite.sh @@ -0,0 +1,4 @@ +#!/bin/sh +# This is the canonical list of which "IT" tests are expected to pass. +# Please note the "dataverse.test.baseurl" is set to run for "all-in-one" Docker environment. +mvn test -Dtest=DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT -Ddataverse.test.baseurl='http://localhost:8083' diff --git a/conf/docker-aio/setupIT.bash b/conf/docker-aio/setupIT.bash new file mode 100755 index 00000000000..528b8f3c5f8 --- /dev/null +++ b/conf/docker-aio/setupIT.bash @@ -0,0 +1,13 @@ +#!/usr/bin/env bash + +# do integration-test install and test data setup + +cd /opt/dv +unzip dvinstall.zip +cd /opt/dv/testdata +./scripts/deploy/phoenix.dataverse.org/prep +./db.sh +./install # modified from phoenix +/usr/local/glassfish4/glassfish/bin/asadmin deploy /opt/dv/dvinstall/dataverse.war +./post # modified from phoenix + diff --git a/conf/docker-aio/testscripts/db.sh b/conf/docker-aio/testscripts/db.sh new file mode 100755 index 00000000000..aeb09f0a7de --- /dev/null +++ b/conf/docker-aio/testscripts/db.sh @@ -0,0 +1,3 @@ +#!/bin/sh +psql -U postgres -c "CREATE ROLE dvnapp UNENCRYPTED PASSWORD 'secret' SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN" template1 +psql -U dvnapp -c 'CREATE DATABASE "dvndb" WITH OWNER = "dvnapp"' template1 diff --git a/conf/docker-aio/testscripts/install b/conf/docker-aio/testscripts/install new file mode 100755 index 00000000000..32f3a39807c --- /dev/null +++ b/conf/docker-aio/testscripts/install @@ -0,0 +1,21 @@ +#!/bin/sh +export HOST_ADDRESS=localhost +export GLASSFISH_ROOT=/usr/local/glassfish4 +export FILES_DIR=/usr/local/glassfish4/glassfish/domains/domain1/files +export DB_NAME=dvndb +export DB_PORT=5432 +export DB_HOST=localhost +export DB_USER=dvnapp +export DB_PASS=secret +export RSERVE_HOST=localhost +export RSERVE_PORT=6311 +export RSERVE_USER=rserve +export RSERVE_PASS=rserve +export SMTP_SERVER=localhost +export MEM_HEAP_SIZE=2048 +export GLASSFISH_DOMAIN=domain1 +cd scripts/installer +cp pgdriver/postgresql-8.4-703.jdbc4.jar $GLASSFISH_ROOT/glassfish/lib +#cp ../../conf/jhove/jhove.conf $GLASSFISH_ROOT/glassfish/domains/$GLASSFISH_DOMAIN/config/jhove.conf +cp /opt/dv/testdata/jhove.conf $GLASSFISH_ROOT/glassfish/domains/$GLASSFISH_DOMAIN/config/jhove.conf +./glassfish-setup.sh diff --git 
a/conf/docker-aio/testscripts/post b/conf/docker-aio/testscripts/post new file mode 100755 index 00000000000..03eaf59fa34 --- /dev/null +++ b/conf/docker-aio/testscripts/post @@ -0,0 +1,15 @@ +#!/bin/sh +cd scripts/api +./setup-all.sh --insecure | tee /tmp/setup-all.sh.out +cd ../.. +psql -U dvnapp dvndb -f scripts/database/reference_data.sql +psql -U dvnapp dvndb -f doc/sphinx-guides/source/_static/util/pg8-createsequence-prep.sql +psql -U dvnapp dvndb -f doc/sphinx-guides/source/_static/util/createsequence.sql +scripts/search/tests/publish-dataverse-root +#git checkout scripts/api/data/dv-root.json +scripts/search/tests/grant-authusers-add-on-root +scripts/search/populate-users +scripts/search/create-users +scripts/search/tests/create-all-and-test +scripts/search/tests/publish-spruce1-and-test +#java -jar downloads/schemaSpy_5.0.0.jar -t pgsql -host localhost -db dvndb -u postgres -p secret -s public -dp scripts/installer/pgdriver/postgresql-9.1-902.jdbc4.jar -o /var/www/html/schemaspy/latest diff --git a/conf/docker/build.sh b/conf/docker/build.sh new file mode 100755 index 00000000000..a4828ba607f --- /dev/null +++ b/conf/docker/build.sh @@ -0,0 +1,42 @@ +#!/bin/sh +# Creates images and pushes them to Docker Hub. +# The "kick-the-tires" tag should be relatively stable. No breaking changes. +# Push to custom tags or tags based on branch names to iterate on the images. +if [ -z "$1" ]; then + echo "No argument supplied. Please specify \"branch\" or \"custom my-custom-tag\" for experiments or \"stable\" if your change won't break anything." + exit 1 +fi + +if [ "$1" == 'branch' ]; then + echo "We'll push a tag to the branch you're on." + GIT_BRANCH=$(git rev-parse --abbrev-ref HEAD) + TAG=$GIT_BRANCH +elif [ "$1" == 'stable' ]; then + echo "We'll push a tag to the most stable tag (which isn't saying much!)." + TAG=kick-the-tires +elif [ "$1" == 'custom' ]; then + if [ -z "$2" ]; then + echo "You must provide a custom tag as the second argument." + exit 1 + else + echo "We'll push a custom tag." + TAG=$2 + fi +else + echo "Unexpected argument: $1. Exiting. Run with no arguments for help." + exit 1 +fi +echo Images will be pushed to Docker Hub with the tag \"$TAG\". +# Use "conf" directory as context so we can copy schema.xml into Solr image. +docker build -t iqss/dataverse-solr:$TAG -f solr/Dockerfile ../../conf +docker push iqss/dataverse-solr:$TAG +# TODO: Think about if we really need dataverse.war because it's in dvinstall.zip. +# FIXME: Automate the building of dataverse.war and dvinstall.zip. Think about https://github.com/IQSS/dataverse/issues/3974 and https://github.com/IQSS/dataverse/pull/3975 +cp ../../target/dataverse*.war dataverse-glassfish/dataverse.war +cp ../../scripts/installer/dvinstall.zip dataverse-glassfish +cp ../../doc/sphinx-guides/source/_static/util/default.config dataverse-glassfish +cp ../../downloads/glassfish-4.1.zip dataverse-glassfish +cp ../../downloads/weld-osgi-bundle-2.2.10.Final-glassfish4.jar dataverse-glassfish +docker build -t iqss/dataverse-glassfish:$TAG dataverse-glassfish +# FIXME: Check the output of `docker build` and only push on success.
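# A sketch of one way to address the FIXME above (left as a comment so the script's behavior is unchanged):
# test the exit status of the `docker build` just above before pushing, e.g.
#   if [ $? -ne 0 ]; then
#     echo "docker build for iqss/dataverse-glassfish failed; not pushing." >&2
#     exit 1
#   fi
# A similar check (or `set -e` near the top of the script) would also cover the solr build/push above.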
+docker push iqss/dataverse-glassfish:$TAG diff --git a/conf/docker/dataverse-glassfish/.gitignore b/conf/docker/dataverse-glassfish/.gitignore new file mode 100644 index 00000000000..b0e6e38894f --- /dev/null +++ b/conf/docker/dataverse-glassfish/.gitignore @@ -0,0 +1,5 @@ +glassfish-4.1.zip +weld-osgi-bundle-2.2.10.Final-glassfish4.jar +dvinstall.zip +dataverse.war +default.config diff --git a/conf/docker/dataverse-glassfish/Dockerfile b/conf/docker/dataverse-glassfish/Dockerfile new file mode 100644 index 00000000000..939ce98fb72 --- /dev/null +++ b/conf/docker/dataverse-glassfish/Dockerfile @@ -0,0 +1,98 @@ +FROM centos:7.2.1511 +MAINTAINER Dataverse (support@dataverse.org) + +COPY glassfish-4.1.zip /tmp +COPY weld-osgi-bundle-2.2.10.Final-glassfish4.jar /tmp +COPY default.config /tmp +COPY dvinstall.zip /tmp + +# Install dependencies +#RUN yum install -y unzip +RUN yum install -y \ + cronie \ + git \ + java-1.8.0-openjdk-devel \ + nc \ + perl \ + postgresql \ + sha1sum \ + unzip \ + wget + +ENV GLASSFISH_DOWNLOAD_SHA1 d1a103d06682eb08722fbc9a93089211befaa080 +ENV GLASSFISH_DIRECTORY "/usr/local/glassfish4" +ENV HOST_DNS_ADDRESS "localhost" +ENV POSTGRES_DB "dvndb" +ENV POSTGRES_USER "dvnapp" +ENV RSERVE_USER "rserve" +ENV RSERVE_PASSWORD "rserve" + +#RUN ls /tmp +# +RUN find /tmp +# +#RUN exitEarly + +# Install Glassfish 4.1 + +RUN cd /tmp \ + && unzip glassfish-4.1.zip \ + && mv glassfish4 /usr/local \ + && cd /usr/local/glassfish4/glassfish/modules \ + && rm weld-osgi-bundle.jar \ + && cp /tmp/weld-osgi-bundle-2.2.10.Final-glassfish4.jar . \ + #FIXME: Patch Grizzly too! + && echo "Done installing and patching Glassfish" + +RUN chmod g=u /etc/passwd + +RUN mkdir -p /home/glassfish +RUN chgrp -R 0 /home/glassfish && \ + chmod -R g=u /home/glassfish + +RUN mkdir -p /usr/local/glassfish4 +RUN chgrp -R 0 /usr/local/glassfish4 && \ + chmod -R g=u /usr/local/glassfish4 + + +#RUN exitEarlyBeforeJq +RUN yum -y install epel-release +RUN yum install -y jq + +# Install jq +#RUN cd /tmp \ +# && wget https://github.com/stedolan/jq/releases/download/jq-1.5/jq-linux64 \ +# && mv jq-linux64 /usr/local/bin \ +# && chmod +x /usr/local/bin/jq-linux64 \ +# && ln -s /usr/local/bin/jq-linux64 /usr/local/bin/jq + +# Customized persistence xml to avoid database recreation +#RUN mkdir -p /tmp/WEB-INF/classes/META-INF/ +#COPY WEB-INF/classes/META-INF/persistence.xml /tmp/WEB-INF/classes/META-INF/ + +# Install iRods iCommands +#RUN cd /tmp \ +# && yum -y install epel-release \ +# && yum -y install ftp://ftp.renci.org/pub/irods/releases/4.1.6/centos7/irods-icommands-4.1.6-centos7-x86_64.rpm + +#COPY config-glassfish /root/dvinstall +#COPY restart-glassfish /root/dvinstall +#COPY config-dataverse /root/dvinstall + +#RUN cd /root/dvinstall && ./config-dataverse + +COPY ./entrypoint.sh / +#COPY ./ddl /root/dvinstall +#COPY ./init-postgres /root/dvinstall +#COPY ./init-glassfish /root/dvinstall +#COPY ./init-dataverse /root/dvinstall +#COPY ./setup-all.sh /root/dvinstall +#COPY ./setup-irods.sh /root/dvinstall +COPY ./Dockerfile / + +VOLUME /usr/local/glassfish4/glassfish/domains/domain1/files + +EXPOSE 8080 + +ENTRYPOINT ["/entrypoint.sh"] +CMD ["dataverse"] diff --git a/conf/docker/dataverse-glassfish/entrypoint.sh b/conf/docker/dataverse-glassfish/entrypoint.sh new file mode 100755 index 00000000000..bc1b7eb3f93 --- /dev/null +++ b/conf/docker/dataverse-glassfish/entrypoint.sh @@ -0,0 +1,135 @@ +#!/bin/bash -x + +# Entrypoint script for Dataverse web application. 
This script waits +# for dependent services (Rserve, Postgres, Solr) to start before +# initializing Glassfish. + +echo "whoami before..." +whoami +if ! whoami &> /dev/null; then + if [ -w /etc/passwd ]; then + # Make `whoami` return the glassfish user. # See https://docs.openshift.org/3.6/creating_images/guidelines.html#openshift-origin-specific-guidelines + # Fancy bash magic from https://github.com/RHsyseng/container-rhel-examples/blob/1208dcd7d4f431fc6598184dba6341b9465f4197/starter-arbitrary-uid/bin/uid_entrypoint#L4 + echo "${USER_NAME:-glassfish}:x:$(id -u):0:${USER_NAME:-glassfish} user:/home/glassfish:/bin/bash" >> /etc/passwd + fi +fi +echo "whoami after" +whoami + +set -e + +if [ "$1" = 'dataverse' ]; then + + export GLASSFISH_DIRECTORY=/usr/local/glassfish4 + export HOST_DNS_ADDRESS=localhost + + TIMEOUT=30 + + if [ -n "$RSERVE_SERVICE_HOST" ]; then + RSERVE_HOST=$RSERVE_SERVICE_HOST + elif [ -n "$RSERVE_PORT_6311_TCP_ADDR" ]; then + RSERVE_HOST=$RSERVE_PORT_6311_TCP_ADDR + elif [ -z "$RSERVE_HOST" ]; then + RSERVE_HOST="localhost" + fi + export RSERVE_HOST + + if [ -n "$RSERVE_SERVICE_PORT" ]; then + RSERVE_PORT=$RSERVE_SERVICE_PORT + elif [ -n "$RSERVE_PORT_6311_TCP_PORT" ]; then + RSERVE_PORT=$RSERVE_PORT_6311_TCP_PORT + elif [ -z "$RSERVE_PORT" ]; then + RSERVE_PORT="6311" + fi + export RSERVE_PORT + + echo "Using Rserve at $RSERVE_HOST:$RSERVE_PORT" + + if ncat $RSERVE_HOST $RSERVE_PORT -w $TIMEOUT --send-only < /dev/null > /dev/null 2>&1 ; then + echo Rserve running; + else + echo Optional service Rserve not running. + fi + + + # postgres + if [ -n "$POSTGRES_SERVICE_HOST" ]; then + POSTGRES_HOST=$POSTGRES_SERVICE_HOST + elif [ -n "$POSTGRES_PORT_5432_TCP_ADDR" ]; then + POSTGRES_HOST=$POSTGRES_PORT_5432_TCP_ADDR + elif [ -z "$POSTGRES_HOST" ]; then + POSTGRES_HOST="localhost" + fi + export POSTGRES_HOST + + if [ -n "$POSTGRES_SERVICE_PORT" ]; then + POSTGRES_PORT=$POSTGRES_SERVICE_PORT + elif [ -n "$POSTGRES_PORT_5432_TCP_PORT" ]; then + POSTGRES_PORT=$POSTGRES_PORT_5432_TCP_PORT + else + POSTGRES_PORT=5432 + fi + export POSTGRES_PORT + + echo "Using Postgres at $POSTGRES_HOST:$POSTGRES_PORT" + + if ncat $POSTGRES_HOST $POSTGRES_PORT -w $TIMEOUT --send-only < /dev/null > /dev/null 2>&1 ; then + echo Postgres running; + else + echo Required service Postgres not running. Have you started the required services? + exit 1 + fi + + # solr + if [ -n "$SOLR_SERVICE_HOST" ]; then + SOLR_HOST=$SOLR_SERVICE_HOST + elif [ -n "$SOLR_PORT_8983_TCP_ADDR" ]; then + SOLR_HOST=$SOLR_PORT_8983_TCP_ADDR + elif [ -z "$SOLR_HOST" ]; then + SOLR_HOST="localhost" + fi + export SOLR_HOST + + if [ -n "$SOLR_SERVICE_PORT" ]; then + SOLR_PORT=$SOLR_SERVICE_PORT + elif [ -n "$SOLR_PORT_8983_TCP_PORT" ]; then + SOLR_PORT=$SOLR_PORT_8983_TCP_PORT + else + SOLR_PORT=8983 + fi + export SOLR_PORT + + echo "Using Solr at $SOLR_HOST:$SOLR_PORT" + + if ncat $SOLR_HOST $SOLR_PORT -w $TIMEOUT --send-only < /dev/null > /dev/null 2>&1 ; then + echo Solr running; + else + echo Required service Solr not running. Have you started the required services? + exit 1 + fi + + GLASSFISH_INSTALL_DIR="/usr/local/glassfish4" + cd $GLASSFISH_INSTALL_DIR + cp /tmp/dvinstall.zip $GLASSFISH_INSTALL_DIR + unzip dvinstall.zip + cd dvinstall + echo Copying the non-interactive file into place + cp /tmp/default.config . 
+ echo Looking at first few lines of default.config + head default.config + # non-interactive install + echo Running non-interactive install + #./install -y -f > install.out 2> install.err + ./install -y -f + +# if [ -n "$DVICAT_PORT_1247_TCP_PORT" ]; then +# ./setup-irods.sh +# fi + + echo -e "\n\nDataverse started" + + sleep infinity +else + exec "$@" +fi + diff --git a/conf/docker/postgresql/Dockerfile b/conf/docker/postgresql/Dockerfile new file mode 100644 index 00000000000..81ecf0fdeb8 --- /dev/null +++ b/conf/docker/postgresql/Dockerfile @@ -0,0 +1,3 @@ +# PostgreSQL for Dataverse (but consider switching to the image from CentOS) +# +# See also conf/docker/dataverse-glassfish/Dockerfile diff --git a/conf/docker/solr/Dockerfile b/conf/docker/solr/Dockerfile new file mode 100644 index 00000000000..99114ce6a6d --- /dev/null +++ b/conf/docker/solr/Dockerfile @@ -0,0 +1,28 @@ +FROM centos:7.2.1511 +MAINTAINER Dataverse (support@dataverse.org) + +RUN yum install -y wget unzip perl git java-1.8.0-openjdk-devel postgresql.x86_64 + +# Install Solr 4.6.0 +# The context of the build is the "conf" directory. +COPY solr/4.6.0/schema.xml /tmp + +RUN cd /tmp && wget https://archive.apache.org/dist/lucene/solr/4.6.0/solr-4.6.0.tgz && \ + tar xvzf solr-4.6.0.tgz && \ + mv solr-4.6.0 /usr/local/ && \ + cd /usr/local/solr-4.6.0/example/solr/collection1/conf/ && \ + mv schema.xml schema.xml.backup && \ + cp /tmp/schema.xml . && \ + rm /tmp/solr-4.6.0.tgz + +RUN ln -s /usr/local/solr-4.6.0/example/logs /var/log/solr + +VOLUME /usr/local/solr-4.6.0/example/solr/collection1/data + +EXPOSE 8983 + +COPY docker/solr/Dockerfile /Dockerfile +COPY docker/solr/entrypoint.sh / + +ENTRYPOINT ["/entrypoint.sh"] +CMD ["solr"] diff --git a/conf/docker/solr/entrypoint.sh b/conf/docker/solr/entrypoint.sh new file mode 100755 index 00000000000..7fd8d6380c2 --- /dev/null +++ b/conf/docker/solr/entrypoint.sh @@ -0,0 +1,10 @@ +#!/bin/bash + +if [ "$1" = 'solr' ]; then + cd /usr/local/solr-4.6.0/example/ + java -jar start.jar +elif [ "$1" = 'usage' ]; then + echo 'docker run -d iqss/dataverse-solr solr' +else + exec "$@" +fi diff --git a/conf/openshift/openshift.json b/conf/openshift/openshift.json new file mode 100644 index 00000000000..330fc8914ae --- /dev/null +++ b/conf/openshift/openshift.json @@ -0,0 +1,392 @@ +{ + "kind": "Template", + "apiVersion": "v1", + "metadata": { + "name": "dataverse", + "labels": { + "name": "dataverse" + }, + "annotations": { + "openshift.io/description": "Dataverse is open source research data repository software: https://dataverse.org", + "openshift.io/display-name": "Dataverse" + } + }, + "objects": [ + { + "kind": "Service", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-glassfish-service" + }, + "spec": { + "selector": { + "name": "iqss-dataverse-glassfish" + }, + "ports": [ + { + "name": "web", + "protocol": "TCP", + "port": 8080, + "targetPort": 8080 + } + ] + } + }, + { + "kind": "Service", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-postgresql-service" + }, + "spec": { + "selector": { + "name": "iqss-dataverse-postgresql" + }, + "ports": [ + { + "name": "database", + "protocol": "TCP", + "port": 5432, + "targetPort": 5432 + } + ] + } + }, + { + "kind": "Service", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-solr-service" + }, + "spec": { + "selector": { + "name": "iqss-dataverse-solr" + }, + "ports": [ + { + "name": "search", + "protocol": "TCP", + "port": 8983, + "targetPort": 8983 + } + ] + } + }, + { + "apiVersion": "v1", + "kind": 
"Route", + "metadata": { + "annotations": { + "openshift.io/host.generated": "true" + }, + "name": "dataverse" + }, + "spec": { + "port": { + "targetPort": "web" + }, + "to": { + "kind": "Service", + "name": "dataverse-glassfish-service", + "weight": 100 + } + } + }, + { + "kind": "ImageStream", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-plus-glassfish" + }, + "spec": { + "dockerImageRepository": "iqss/dataverse-glassfish" + } + }, + { + "kind": "ImageStream", + "apiVersion": "v1", + "metadata": { + "name": "centos-postgresql-94-centos7" + }, + "spec": { + "dockerImageRepository": "centos/postgresql-94-centos7" + } + }, + { + "kind": "ImageStream", + "apiVersion": "v1", + "metadata": { + "name": "iqss-dataverse-solr" + }, + "spec": { + "dockerImageRepository": "iqss/dataverse-solr" + } + }, + { + "kind": "DeploymentConfig", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-glassfish", + "annotations": { + "template.alpha.openshift.io/wait-for-ready": "true" + } + }, + "spec": { + "template": { + "metadata": { + "labels": { + "name": "iqss-dataverse-glassfish" + } + }, + "spec": { + "containers": [ + { + "name": "dataverse-plus-glassfish", + "image": "dataverse-plus-glassfish", + "ports": [ + { + "containerPort": 8080, + "protocol": "TCP" + } + ], + "env": [ + { + "name": "POSTGRES_SERVICE_HOST", + "value": "dataverse-postgresql-service" + }, + { + "name": "SOLR_SERVICE_HOST", + "value": "dataverse-solr-service" + }, + { + "name": "ADMIN_PASSWORD", + "value": "admin" + }, + { + "name": "SMTP_HOST", + "value": "localhost" + }, + { + "name": "POSTGRES_USER", + "value": "dvnapp" + }, + { + "name": "POSTGRES_PASSWORD", + "value": "secret" + }, + { + "name": "POSTGRES_DATABASE", + "value": "dvndb" + } + ], + "imagePullPolicy": "IfNotPresent", + "securityContext": { + "capabilities": {}, + "privileged": false + } + } + ] + } + }, + "strategy": { + "type": "Rolling", + "rollingParams": { + "updatePeriodSeconds": 1, + "intervalSeconds": 1, + "timeoutSeconds": 300 + }, + "resources": {} + }, + "triggers": [ + { + "type": "ImageChange", + "imageChangeParams": { + "automatic": true, + "containerNames": [ + "dataverse-plus-glassfish" + ], + "from": { + "kind": "ImageStreamTag", + "name": "dataverse-plus-glassfish:latest" + } + } + }, + { + "type": "ConfigChange" + } + ], + "replicas": 1, + "selector": { + "name": "iqss-dataverse-glassfish" + } + } + }, + { + "kind": "DeploymentConfig", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-postgresql", + "annotations": { + "template.alpha.openshift.io/wait-for-ready": "true" + } + }, + "spec": { + "template": { + "metadata": { + "labels": { + "name": "iqss-dataverse-postgresql" + } + }, + "spec": { + "containers": [ + { + "name": "centos-postgresql-94-centos7", + "image": "centos-postgresql-94-centos7", + "ports": [ + { + "containerPort": 5432, + "protocol": "TCP" + } + ], + "env": [ + { + "name": "POSTGRESQL_USER", + "value": "dvnapp" + }, + { + "name": "POSTGRESQL_PASSWORD", + "value": "secret" + }, + { + "name": "POSTGRESQL_DATABASE", + "value": "dvndb" + }, + { + "name": "POSTGRESQL_ADMIN_PASSWORD", + "value": "secret" + } + + ], + "resources": { + "limits": { + "memory": "256Mi" + } + }, + "imagePullPolicy": "IfNotPresent", + "securityContext": { + "capabilities": {}, + "privileged": false + } + } + ] + } + }, + "strategy": { + "type": "Rolling", + "rollingParams": { + "updatePeriodSeconds": 1, + "intervalSeconds": 1, + "timeoutSeconds": 300 + }, + "resources": {} + }, + "triggers": [ + { + "type": "ImageChange", + 
"imageChangeParams": { + "automatic": true, + "containerNames": [ + "centos-postgresql-94-centos7" + ], + "from": { + "kind": "ImageStreamTag", + "name": "centos-postgresql-94-centos7:latest" + } + } + }, + { + "type": "ConfigChange" + } + ], + "replicas": 1, + "selector": { + "name": "iqss-dataverse-postgresql" + } + } + }, + { + "kind": "DeploymentConfig", + "apiVersion": "v1", + "metadata": { + "name": "dataverse-solr", + "annotations": { + "template.alpha.openshift.io/wait-for-ready": "true" + } + }, + "spec": { + "template": { + "metadata": { + "labels": { + "name": "iqss-dataverse-solr" + } + }, + "spec": { + "containers": [ + { + "name": "iqss-dataverse-solr", + "image": "iqss-dataverse-solr", + "ports": [ + { + "containerPort": 8983, + "protocol": "TCP" + } + ], + "resources": { + "limits": { + "memory": "256Mi" + } + }, + "imagePullPolicy": "IfNotPresent", + "securityContext": { + "capabilities": {}, + "privileged": false + } + } + ] + } + }, + "strategy": { + "type": "Rolling", + "rollingParams": { + "updatePeriodSeconds": 1, + "intervalSeconds": 1, + "timeoutSeconds": 300 + }, + "resources": {} + }, + "triggers": [ + { + "type": "ImageChange", + "imageChangeParams": { + "automatic": true, + "containerNames": [ + "iqss-dataverse-solr" + ], + "from": { + "kind": "ImageStreamTag", + "name": "iqss-dataverse-solr:latest" + } + } + }, + { + "type": "ConfigChange" + } + ], + "replicas": 1, + "selector": { + "name": "iqss-dataverse-solr" + } + } + } + ] +} diff --git a/doc/sphinx-guides/source/_static/admin/ipGroupAll.json b/doc/sphinx-guides/source/_static/admin/ipGroupAll.json new file mode 100644 index 00000000000..567258e69ee --- /dev/null +++ b/doc/sphinx-guides/source/_static/admin/ipGroupAll.json @@ -0,0 +1,14 @@ +{ + "alias": "ipGroupAll", + "name": "IP group to match all IPv4 and IPv6 addresses", + "ranges": [ + [ + "0.0.0.0", + "255.255.255.255" + ], + [ + "::", + "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff" + ] + ] +} diff --git a/doc/sphinx-guides/source/_static/api/dataset-update-metadata.json b/doc/sphinx-guides/source/_static/api/dataset-update-metadata.json new file mode 100644 index 00000000000..6e499d4e164 --- /dev/null +++ b/doc/sphinx-guides/source/_static/api/dataset-update-metadata.json @@ -0,0 +1,86 @@ +{ + "metadataBlocks": { + "citation": { + "displayName": "Citation Metadata", + "fields": [ + { + "typeName": "title", + "multiple": false, + "typeClass": "primitive", + "value": "newTitle" + }, + { + "typeName": "author", + "multiple": true, + "typeClass": "compound", + "value": [ + { + "authorName": { + "typeName": "authorName", + "multiple": false, + "typeClass": "primitive", + "value": "Spruce, Sabrina" + } + } + ] + }, + { + "typeName": "datasetContact", + "multiple": true, + "typeClass": "compound", + "value": [ + { + "datasetContactName": { + "typeName": "datasetContactName", + "multiple": false, + "typeClass": "primitive", + "value": "Spruce, Sabrina" + }, + "datasetContactEmail": { + "typeName": "datasetContactEmail", + "multiple": false, + "typeClass": "primitive", + "value": "spruce@mailinator.com" + } + } + ] + }, + { + "typeName": "dsDescription", + "multiple": true, + "typeClass": "compound", + "value": [ + { + "dsDescriptionValue": { + "typeName": "dsDescriptionValue", + "multiple": false, + "typeClass": "primitive", + "value": "test" + } + } + ] + }, + { + "typeName": "subject", + "multiple": true, + "typeClass": "controlledVocabulary", + "value": [ + "Other" + ] + }, + { + "typeName": "depositor", + "multiple": false, + "typeClass": "primitive", + 
"value": "Spruce, Sabrina" + }, + { + "typeName": "dateOfDeposit", + "multiple": false, + "typeClass": "primitive", + "value": "2017-04-19" + } + ] + } + } +} diff --git a/doc/sphinx-guides/source/_static/installation/files/etc/systemd/solr.service b/doc/sphinx-guides/source/_static/installation/files/etc/systemd/solr.service new file mode 100644 index 00000000000..46fb6ea8e2a --- /dev/null +++ b/doc/sphinx-guides/source/_static/installation/files/etc/systemd/solr.service @@ -0,0 +1,13 @@ +[Unit] +Description = Apache Solr +After = syslog.target network.target remote-fs.target nss-lookup.target + +[Service] +User = solr +Type = simple +WorkingDirectory = /usr/local/solr/example +ExecStart = /usr/bin/java -jar -server /usr/local/solr/example/start.jar +Restart=on-failure + +[Install] +WantedBy = multi-user.target diff --git a/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid-sandbox.json b/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid-sandbox.json index 61bf7b79c82..f5c83829f50 100644 --- a/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid-sandbox.json +++ b/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid-sandbox.json @@ -3,6 +3,6 @@ "factoryAlias":"oauth2", "title":"ORCID", "subtitle":"", - "factoryData":"type: orcid | userEndpoint: https://api.sandbox.orcid.org/v1.2/{ORCID}/orcid-profile | clientId: FIXME | clientSecret: FIXME", + "factoryData":"type: orcid | userEndpoint: https://api.sandbox.orcid.org/v2.0/{ORCID}/person | clientId: FIXME | clientSecret: FIXME", "enabled":true } diff --git a/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid.json b/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid.json index 0615bacb056..3b974a3fbc4 100644 --- a/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid.json +++ b/doc/sphinx-guides/source/_static/installation/files/root/auth-providers/orcid.json @@ -3,6 +3,6 @@ "factoryAlias":"oauth2", "title":"ORCID", "subtitle":"", - "factoryData":"type: orcid | userEndpoint: https://api.orcid.org/v1.2/{ORCID}/orcid-profile | clientId: FIXME | clientSecret: FIXME", + "factoryData":"type: orcid | userEndpoint: https://api.orcid.org/v2.0/{ORCID}/person | clientId: FIXME | clientSecret: FIXME", "enabled":true } diff --git a/doc/sphinx-guides/source/_static/installation/files/root/external-tools/awesomeTool.json b/doc/sphinx-guides/source/_static/installation/files/root/external-tools/awesomeTool.json new file mode 100644 index 00000000000..7f1e801be82 --- /dev/null +++ b/doc/sphinx-guides/source/_static/installation/files/root/external-tools/awesomeTool.json @@ -0,0 +1,16 @@ +{ + "displayName": "Awesome Tool", + "description": "The most awesome tool.", + "type": "explore", + "toolUrl": "https://awesometool.com", + "toolParameters": { + "queryParameters": [ + { + "fileid": "{fileId}" + }, + { + "key": "{apiToken}" + } + ] + } +} diff --git a/doc/sphinx-guides/source/_static/installation/files/root/external-tools/twoRavens.json b/doc/sphinx-guides/source/_static/installation/files/root/external-tools/twoRavens.json new file mode 100644 index 00000000000..dad65ccc902 --- /dev/null +++ b/doc/sphinx-guides/source/_static/installation/files/root/external-tools/twoRavens.json @@ -0,0 +1,16 @@ +{ + "displayName": "TwoRavens", + "description": "A system of interlocking statistical tools for data exploration, analysis, and meta-analysis.", + "type": "explore", + "toolUrl": 
"https://tworavens.dataverse.example.edu/dataexplore/gui.html", + "toolParameters": { + "queryParameters": [ + { + "dfId": "{fileId}" + }, + { + "key": "{apiToken}" + } + ] + } +} diff --git a/doc/sphinx-guides/source/_static/util/check_timer.bash b/doc/sphinx-guides/source/_static/util/check_timer.bash new file mode 100755 index 00000000000..e75ea686496 --- /dev/null +++ b/doc/sphinx-guides/source/_static/util/check_timer.bash @@ -0,0 +1,18 @@ +#!/usr/bin/env bash + +# example monitoring script for EBJ timers. +# currently assumes that there are two timers +# real monitoring commands should replace the echo statements for production use + +r0=`curl -s http://localhost:8080/ejb-timer-service-app/timer` + +if [ $? -ne 0 ]; then + echo "alert - no timer service" # put real alert command here +fi + +r1=`echo $r0 | grep -c "There are 2 active persistent timers on this container"` + +if [ "1" -ne "$r1" ]; then + echo "alert - no active timers" # put real alert command here +fi + diff --git a/doc/sphinx-guides/source/_static/util/createsequence.sql b/doc/sphinx-guides/source/_static/util/createsequence.sql index 2af1e06d45d..2677832abd8 100644 --- a/doc/sphinx-guides/source/_static/util/createsequence.sql +++ b/doc/sphinx-guides/source/_static/util/createsequence.sql @@ -19,7 +19,7 @@ CACHE 1; ALTER TABLE datasetidentifier_seq OWNER TO "dvnapp"; --- And now create a PostgresQL FUNCTION, for JPA to +-- And now create a PostgreSQL FUNCTION, for JPA to -- access as a NamedStoredProcedure: CREATE OR REPLACE FUNCTION generateIdentifierAsSequentialNumber( diff --git a/doc/sphinx-guides/source/_static/util/default.config b/doc/sphinx-guides/source/_static/util/default.config index 8f95e1dd3a3..3fb3d98472c 100644 --- a/doc/sphinx-guides/source/_static/util/default.config +++ b/doc/sphinx-guides/source/_static/util/default.config @@ -8,7 +8,7 @@ POSTGRES_PORT 5432 POSTGRES_DATABASE dvndb POSTGRES_USER dvnapp POSTGRES_PASSWORD secret -SOLR_LOCATION LOCAL +SOLR_LOCATION localhost:8983 TWORAVENS_LOCATION NOT INSTALLED RSERVE_HOST localhost RSERVE_PORT 6311 diff --git a/doc/sphinx-guides/source/admin/backups.rst b/doc/sphinx-guides/source/admin/backups.rst new file mode 100644 index 00000000000..e11aa84ebfc --- /dev/null +++ b/doc/sphinx-guides/source/admin/backups.rst @@ -0,0 +1,11 @@ +Backups +======= + +.. contents:: Contents: + :local: + +Running tape, or similar backups to ensure the long term preservation of all the data stored in the Dataverse is an implied responsibility that should be taken most seriously. + +*In addition* to running these disk-level backups, we have provided an experimental script that can be run on schedule (via a cron job or something similar) to create extra archival copies of all the Datafiles stored in the Dataverse on a remote storage server, accessible via an ssh connection. The script and some documentation can be found in ``scripts/backup/run_backup`` in the Dataverse source tree at https://github.com/IQSS/dataverse . Some degree of knowledge of system administration and Python is required. + +Once again, the script is experimental and NOT a replacement of regular and reliable disk backups! diff --git a/doc/sphinx-guides/source/admin/index.rst b/doc/sphinx-guides/source/admin/index.rst index 35296d86a62..92bef1d6336 100755 --- a/doc/sphinx-guides/source/admin/index.rst +++ b/doc/sphinx-guides/source/admin/index.rst @@ -21,6 +21,8 @@ These "superuser" tasks are managed via the new page called the Dashboard. 
A use geoconnect-worldmap user-administration solr-search-index + ip-groups monitoring maintenance + backups troubleshooting diff --git a/doc/sphinx-guides/source/admin/ip-groups.rst b/doc/sphinx-guides/source/admin/ip-groups.rst new file mode 100644 index 00000000000..642b19878f5 --- /dev/null +++ b/doc/sphinx-guides/source/admin/ip-groups.rst @@ -0,0 +1,43 @@ +IP Groups +========= + +IP Groups can be used to permit download of restricted files by IP addresses rather than people. For example, you may want to allow restricted files to be downloaded by researchers who physically enter a library and make use of the library's network. + +.. contents:: Contents: + :local: + +Listing IP Groups +----------------- + +IP Groups can be listed with the following curl command: + +``curl http://localhost:8080/api/admin/groups/ip`` + +Creating an IP Group +-------------------- + +IP Groups must be expressed as ranges in IPv4 or IPv6 format. For illustrative purposes, here is an example of the entire IPv4 and IPv6 range that you can :download:`download <../_static/admin/ipGroupAll.json>` and edit to have a narrower range to meet your needs. If you need your IP Group to only encompass a single IP address, you must enter that IP address for the "start" and "end" of the range. If you don't use IPv6 addresses, you can delete that section of the JSON. Please note that the "alias" must be unique if you define multiple IP Groups. You should give it a meaningful "name" since both "alias" and "name" will appear and be searchable in the GUI when your users are assigning roles. + +.. literalinclude:: ../_static/admin/ipGroupAll.json + +Let's say you download the example above and edit it to give it a range used by your library, giving it a filename of ``ipGroup1.json`` and putting it in the ``/tmp`` directory. Next, load it into Dataverse using the following curl command: + +``curl -X POST -H 'Content-type: application/json' http://localhost:8080/api/admin/groups/ip --upload-file /tmp/ipGroup1.json`` + +Note that you can update a group the same way, as long as you use the same alias. + +Listing an IP Group +-------------------- + +Let's say you used "ipGroup1" as the alias of the IP Group you created above. To list just that IP Group, you can include the alias in the curl command like this: + +``curl http://localhost:8080/api/admin/groups/ip/ipGroup1`` + +Deleting an IP Group +-------------------- + +It is not recommended to delete an IP Group that has been assigned roles. If you want to delete an IP Group, you should first remove its permissions. + +To delete an IP Group with an alias of "ipGroup1", use the curl command below: + +``curl -X DELETE http://localhost:8080/api/admin/groups/ip/ipGroup1`` diff --git a/doc/sphinx-guides/source/admin/metadataexport.rst b/doc/sphinx-guides/source/admin/metadataexport.rst index 8c50ceacd84..c6ebef0ce15 100644 --- a/doc/sphinx-guides/source/admin/metadataexport.rst +++ b/doc/sphinx-guides/source/admin/metadataexport.rst @@ -7,7 +7,12 @@ Metadata Export Automatic Exports ----------------- -Unlike in DVN v3, publishing a dataset in Dataverse 4 automaticalliy starts a metadata export job, that will run in the background, asynchronously. Once completed, it will make the dataset metadata exported and cached in all the supported formats (Dublin Core, Data Documentation Initiative (DDI), and native JSON). There is no need to run the export manually. +Publishing a dataset automatically starts a metadata export job that will run in the background, asynchronously.
Once completed, it will make the dataset metadata exported and cached in all the supported formats: + +- Dublin Core +- Data Documentation Initiative (DDI) +- Schema.org JSON-LD +- native JSON (Dataverse-specific) A scheduled timer job that runs nightly will attempt to export any published datasets that for whatever reason haven't been exported yet. This timer is activated automatically on the deployment, or restart, of the application. So, again, no need to start or configure it manually. (See the "Application Timers" section of this guide for more information) @@ -28,4 +33,4 @@ Note, that creating, modifying, or re-exporting an OAI set will also attempt to Export Failures --------------- -An export batch job, whether started via the API, or by the application timer, will leave a detailed log in your configured logs directory. This is the same location where your main Glassfish server.log is found. The name of the log file is ``export_[timestamp].log`` - for example, *export_2016-08-23T03-35-23.log*. The log will contain the numbers of datasets processed successfully and those for which metadata export failed, with some information on the failures detected. Please attach this log file if you need to contact Dataverse support about metadata export problems. \ No newline at end of file +An export batch job, whether started via the API, or by the application timer, will leave a detailed log in your configured logs directory. This is the same location where your main Glassfish server.log is found. The name of the log file is ``export_[timestamp].log`` - for example, *export_2016-08-23T03-35-23.log*. The log will contain the numbers of datasets processed successfully and those for which metadata export failed, with some information on the failures detected. Please attach this log file if you need to contact Dataverse support about metadata export problems. diff --git a/doc/sphinx-guides/source/admin/monitoring.rst b/doc/sphinx-guides/source/admin/monitoring.rst index 5e2eb95abca..aa5131d1e8a 100644 --- a/doc/sphinx-guides/source/admin/monitoring.rst +++ b/doc/sphinx-guides/source/admin/monitoring.rst @@ -1,11 +1,107 @@ Monitoring =========== +Once you're in production, you'll want to set up some monitoring. This page may serve as a starting point for you but you are encouraged to share your ideas with the Dataverse community! + .. contents:: Contents: :local: -In production you'll want to monitor the usual suspects such as CPU, memory, free disk space, etc. +Operating System Monitoring +--------------------------- + +In production you'll want to monitor the usual suspects such as CPU, memory, free disk space, etc. There are a variety of tools in this space but we'll highlight Munin below because it's relatively easy to set up. + +Munin ++++++ + +http://munin-monitoring.org says, "A default installation provides a lot of graphs with almost no work." From RHEL or CentOS 7, you can try the following steps. + +Enable the EPEL yum repo (if you haven't already): + +``yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm`` + +Install Munin: + +``yum install munin`` + +Start Munin: + +``systemctl start munin-node.service`` + +Configure Munin to start at boot: + +``systemctl enable munin-node.service`` + +Create a username/password (i.e. 
"admin" for both): + +``htpasswd /etc/munin/munin-htpasswd admin`` + +Assuming you are fronting Glassfish with Apache, prevent Apache from proxying "/munin" traffic to Glassfish by adding the following line to your Apache config: + +``ProxyPassMatch ^/munin !`` + +Then reload Apache to pick up the config change: + +``systemctl reload httpd.service`` + +Test auth for the web interface: + +``curl http://localhost/munin/ -u admin:admin`` + +At this point, graphs should start being generated for disk, network, processes, system, etc. + +HTTP Traffic +------------ -https://github.com/IQSS/dataverse/issues/2595 contains some information on enabling monitoring of Glassfish, which is disabled by default. +HTTP traffic can be monitored from the client side, the server side, or both. + +Monitoring HTTP Traffic from the Client Side +++++++++++++++++++++++++++++++++++++++++++++ + +HTTP traffic for web clients that have cookies enabled (most browsers) can be tracked by Google Analytics and Piwik (renamed to "Matomo") as explained in the :doc:`/installation/config` section of the Installation Guide under ``:GoogleAnalyticsCode`` and ``:PiwikAnalyticsId``, respectively. You could also embed additional client side monitoring solutions by using a custom footer (``:FooterCustomizationFile``), which is described on the same page. + +Monitoring HTTP Traffic from the Server Side ++++++++++++++++++++++++++++++++++++++++++++++ + +There are a wide variety of solutions available for monitoring HTTP traffic from the server side. The following are merely suggestions and a pull request against what is written here to add additional ideas is certainly welcome! Are you excited about the ELK stack (Elasticsearch, Logstash, and Kibana)? The TICK stack (Telegraph InfluxDB Chronograph and Kapacitor)? GoAccess? Prometheus? Graphite? Splunk? Please consider sharing your work with the Dataverse community! + +AWStats ++++++++ + +AWStats is a venerable tool for monitoring web traffic based on Apache access logs. On RHEL/CentOS 7, you can try the following steps. + +Enable the EPEL yum repo: + +``yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm`` + +Install AWStats: + +``yum install awstats`` + +Assuming you are using HTTPS rather than HTTP (and you should!), edit ``/etc/awstats/awstats.standalone.conf`` and change ``LogFile="/var/log/httpd/access_log"`` to ``LogFile="/var/log/httpd/ssl_access_log"``. In the same file, change ``LogFormat=1`` to ``LogFormat=4``. Make both of these changes (``LogFile`` and ``LogFormat`` in ``/etc/awstats/awstats.localhost.localdomain.conf`` as well. + +Process the logs: + +``/usr/share/awstats/tools/awstats_updateall.pl now`` + +Please note that load balancers (such as Amazon's ELB) might interfer with the ``LogFormat`` mentioned above. To start troubleshooting errors such as ``AWStats did not find any valid log lines that match your LogFormat parameter``, you might need to bump up the value of ``NbOfLinesForCorruptedLog`` in the config files above and re-try while you interate on your Apache and AWStats config. + +Please note that the Dataverse team has attempted to parse Glassfish logs using AWStats but it didn't seem to just work and posts have been made at https://stackoverflow.com/questions/49134154/what-logformat-definition-does-awstats-require-to-parse-glassfish-http-access-logs and https://sourceforge.net/p/awstats/discussion/43428/thread/9b1befda/ that can be followed up on some day. 
+ +Database Connection Pool used by Glassfish +------------------------------------------ + +https://github.com/IQSS/dataverse/issues/2595 contains some information on enabling monitoring of Glassfish, which is disabled by default. It's a TODO to document what to do here if there is sufficient interest. + + +actionlogrecord +--------------- There is a database table called ``actionlogrecord`` that captures events that may be of interest. See https://github.com/IQSS/dataverse/issues/2729 for more discussion around this table. + +EJB Timers +---------- + +Should you be interested in monitoring the EJB timers, this script may be used as an example: + +.. literalinclude:: ../_static/util/check_timer.bash diff --git a/doc/sphinx-guides/source/admin/timers.rst b/doc/sphinx-guides/source/admin/timers.rst index f118604654b..3c1ff40f935 100644 --- a/doc/sphinx-guides/source/admin/timers.rst +++ b/doc/sphinx-guides/source/admin/timers.rst @@ -24,21 +24,23 @@ The following JVM option instructs the application to act as the dedicated timer **IMPORTANT:** Note that this option is automatically set by the Dataverse installer script. That means that when **configuring a multi-server cluster**, it will be the responsibility of the installer to remove the option from the :fixedwidthplain:`domain.xml` of every node except the one intended to be the timer server. We also recommend that the following entry in the :fixedwidthplain:`domain.xml`: ```` is changed back to ```` on all the non-timer server nodes. Similarly, this option is automatically set by the installer script. Changing it back to the default setting on a server that doesn't need to run the timer will prevent a potential race condition, where multiple servers try to get a lock on the timer database. +**Note** that for the timer to work, the version of the PostgreSQL JDBC driver your instance is using must match the version of your PostgreSQL database. See the 'Timer not working' section of the :doc:`/admin/troubleshooting` guide. + Harvesting Timers ----------------- These timers are created when scheduled harvesting is enabled by a local admin user (via the "Manage Harvesting Clients" page). -In a multi-node cluster, all these timers will be created on the dedicated timer node (and not necessarily on the node where the harvesting clients was created and/or saved). +In a multi-node cluster, all these timers will be created on the dedicated timer node (and not necessarily on the node where the harvesting clients were created and/or saved). -A timer will be automatically removed, when a harvesting client with an active schedule is deleted, or if the schedule is turned off for an existing client. +A timer will be automatically removed when a harvesting client with an active schedule is deleted, or if the schedule is turned off for an existing client. Metadata Export Timer --------------------- This timer is created automatically whenever the application is deployed or restarted. There is no admin user-accessible configuration for this timer. -This timer runs a daily job that tries to export all the local, published datasets that haven't been exported yet, in all the supported metdata formats, and cache the results on the filesystem. (Note that, normally, an export will happen automatically whenever a dataset is published. So this scheduled job is there to catch any datasets for which that export did not succeed, for one reason or another). 
Also, since this functionality has been added in version 4.5: if you are upgrading from a previous version, none of your datasets are exported yet. So the first time this job runs, it will attempt to export them all. +This timer runs a daily job that tries to export all the local, published datasets that haven't been exported yet, in all supported metadata formats, and cache the results on the filesystem. (Note that normally an export will happen automatically whenever a dataset is published. This scheduled job is there to catch any datasets for which that export did not succeed, for one reason or another). Also, since this functionality has been added in version 4.5: if you are upgrading from a previous version, none of your datasets are exported yet. So the first time this job runs, it will attempt to export them all. This daily job will also update all the harvestable OAI sets configured on your server, adding new and/or newly published datasets or marking deaccessioned datasets as "deleted" in the corresponding sets as needed. @@ -47,4 +49,4 @@ This job is automatically scheduled to run at 2AM local time every night. If rea Known Issues ------------ -We've got several reports of an intermittent issue where the applicaiton fails to deploy with the error message "EJB Timer Service is not available." Please see the :doc:`/admin/troubleshooting` section of this guide for a workaround. \ No newline at end of file +We've received several reports of an intermittent issue where the application fails to deploy with the error message "EJB Timer Service is not available." Please see the :doc:`/admin/troubleshooting` section of this guide for a workaround. diff --git a/doc/sphinx-guides/source/admin/troubleshooting.rst b/doc/sphinx-guides/source/admin/troubleshooting.rst index fb7ed8a8326..662060b7438 100644 --- a/doc/sphinx-guides/source/admin/troubleshooting.rst +++ b/doc/sphinx-guides/source/admin/troubleshooting.rst @@ -38,3 +38,23 @@ Note that it may or may not work on your system, so it is provided as an example .. literalinclude:: ../_static/util/clear_timer.sh +Timer not working +----------------- + +Dataverse relies on EJB timers to perform scheduled tasks: harvesting from remote servers, updating the local OAI sets and running metadata exports. (See :doc:`timers` for details.) If these scheduled jobs are not running on your server, this may be the result of an incompatibility between the version of the PostgreSQL database you are using and the version of the PostgreSQL JDBC driver in use by your instance of Glassfish. The symptoms: + +If you are seeing the following in your server.log... + +:fixedwidthplain:`Handling timeout on` ... + +followed by an Exception stack trace with these lines in it: + +:fixedwidthplain:`Internal Exception: java.io.StreamCorruptedException: invalid stream header` ... + +:fixedwidthplain:`Exception Description: Could not deserialize object from byte array` ... + + +... it most likely means that it is the JDBC driver incompatibility that's preventing the timer from working correctly. +Make sure you install the correct version of the driver. For example, if you are running version 9.3 of PostgreSQL, make sure you have the driver postgresql-9.3-1104.jdbc4.jar in your :fixedwidthplain:`/glassfish/lib` directory. Go `here `_ +to download the correct version of the driver. If you have an older driver in glassfish/lib, make sure to remove it, replace it with the new version and restart Glassfish.
(You may need to remove the entire contents of :fixedwidthplain:`/glassfish/domains/domain1/generated` before you start Glassfish). + diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index 7cce55b81db..97aca840440 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -152,7 +152,7 @@ Delete the dataset whose id is passed:: GET http://$SERVER/api/datasets/export?exporter=ddi&persistentId=$persistentId -.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, and ``dataverse_json``. +.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org`` , and ``dataverse_json``. |CORS| Lists all the file metadata, for the given dataset and version:: @@ -170,6 +170,10 @@ Updates the current draft version of dataset ``$id``. If the dataset does not ha PUT http://$SERVER/api/datasets/$id/versions/:draft?key=$apiKey +Moves a dataset whose id is passed to a dataverse whose alias is passed. Only accessible to superusers. :: + + POST http://$SERVER/api/datasets/$id/move/$alias?key=$apiKey + Publishes the dataset whose id is passed. The new dataset version number is determined by the most recent version number and the ``type`` parameter. Passing ``type=minor`` increases the minor version number (2.3 is updated to 2.4). Passing ``type=major`` increases the major version number (2.3 is updated to 3.0). :: POST http://$SERVER/api/datasets/$id/actions/:publish?type=$type&key=$apiKey @@ -437,13 +441,6 @@ Place this ``user-add.json`` file in your current directory and run the followin curl -d @user-add.json -H "Content-type:application/json" "$SERVER_URL/api/builtin-users?password=$NEWUSER_PASSWORD&key=$BUILTIN_USERS_KEY" -Retrieving the API Token of a Builtin User -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To retrieve the API token of a builtin user, given that user's password, use the curl command below:: - - curl "$SERVER_URL/api/builtin-users/$DV_USER_NAME/api-token?password=$DV_USER_PASSWORD" - Roles ~~~~~ @@ -525,7 +522,10 @@ For now, only the value for the ``:DatasetPublishPopupCustomText`` setting from GET http://$SERVER/api/info/settings/:DatasetPublishPopupCustomText +Get API Terms of Use. The response contains the text value inserted as API Terms of use which uses the database setting ``:ApiTermsOfUse``:: + GET http://$SERVER/api/info/apiTermsOfUse + Metadata Blocks ~~~~~~~~~~~~~~~ @@ -777,30 +777,6 @@ List a role assignee (i.e. a user or a group):: The ``$identifier`` should start with an ``@`` if it's a user. Groups start with ``&``. "Built in" users and groups start with ``:``. Private URL users start with ``#``. -IpGroups -^^^^^^^^ - -Lists all the ip groups:: - - GET http://$SERVER/api/admin/groups/ip - -Adds a new ip group. POST data should specify the group in JSON format. Examples are available at the ``data`` folder. Using this method, an IP Group is always created, but its ``alias`` might be different than the one appearing in the -JSON file, to ensure it is unique. :: - - POST http://$SERVER/api/admin/groups/ip - -Creates or updates the ip group ``$groupAlias``. :: - - POST http://$SERVER/api/admin/groups/ip/$groupAlias - -Returns a the group in a JSON format. ``$groupIdtf`` can either be the group id in the database (in case it is numeric), or the group alias. :: - - GET http://$SERVER/api/admin/groups/ip/$groupIdtf - -Deletes the group specified by ``groupIdtf``. 
``groupIdtf`` can either be the group id in the database (in case it is numeric), or the group alias. Note that a group can be deleted only if there are no roles assigned to it. :: - - DELETE http://$SERVER/api/admin/groups/ip/$groupIdtf - Saved Search ^^^^^^^^^^^^ diff --git a/doc/sphinx-guides/source/api/search.rst b/doc/sphinx-guides/source/api/search.rst index 8a0524c2d43..bb8e268e698 100755 --- a/doc/sphinx-guides/source/api/search.rst +++ b/doc/sphinx-guides/source/api/search.rst @@ -21,20 +21,21 @@ Please note that in Dataverse 4.3 and older the "citation" field wrapped the per Parameters ---------- -============== ======= =========== -Name Type Description -============== ======= =========== -q string The search term or terms. Using "title:data" will search only the "title" field. "*" can be used as a wildcard either alone or adjacent to a term (i.e. "bird*"). For example, https://demo.dataverse.org/api/search?q=title:data -type string Can be either "dataverse", "dataset", or "file". Multiple "type" parameters can be used to include multiple types (i.e. ``type=dataset&type=file``). If omitted, all types will be returned. For example, https://demo.dataverse.org/api/search?q=*&type=dataset -subtree string The identifier of the dataverse to which the search should be narrowed. The subtree of this dataverse and all its children will be searched. For example, https://demo.dataverse.org/api/search?q=data&subtree=birds -sort string The sort field. Supported values include "name" and "date". See example under "order". -order string The order in which to sort. Can either be "asc" or "desc". For example, https://demo.dataverse.org/api/search?q=data&sort=name&order=asc -per_page int The number of results to return per request. The default is 10. The max is 1000. See :ref:`iteration example `. -start int A cursor for paging through search results. See :ref:`iteration example `. -show_relevance boolean Whether or not to show details of which fields were matched by the query. False by default. See :ref:`advanced search example `. -show_facets boolean Whether or not to show facets that can be operated on by the "fq" parameter. False by default. See :ref:`advanced search example `. -fq string A filter query on the search term. Multiple "fq" parameters can be used. See :ref:`advanced search example `. -============== ======= =========== +=============== ======= =========== +Name Type Description +=============== ======= =========== +q string The search term or terms. Using "title:data" will search only the "title" field. "*" can be used as a wildcard either alone or adjacent to a term (i.e. "bird*"). For example, https://demo.dataverse.org/api/search?q=title:data +type string Can be either "dataverse", "dataset", or "file". Multiple "type" parameters can be used to include multiple types (i.e. ``type=dataset&type=file``). If omitted, all types will be returned. For example, https://demo.dataverse.org/api/search?q=*&type=dataset +subtree string The identifier of the dataverse to which the search should be narrowed. The subtree of this dataverse and all its children will be searched. For example, https://demo.dataverse.org/api/search?q=data&subtree=birds +sort string The sort field. Supported values include "name" and "date". See example under "order". +order string The order in which to sort. Can either be "asc" or "desc". For example, https://demo.dataverse.org/api/search?q=data&sort=name&order=asc +per_page int The number of results to return per request. The default is 10. The max is 1000. 
See :ref:`iteration example `. +start           int     A cursor for paging through search results. See :ref:`iteration example `. +show_relevance  boolean Whether or not to show details of which fields were matched by the query. False by default. See :ref:`advanced search example `. +show_facets     boolean Whether or not to show facets that can be operated on by the "fq" parameter. False by default. See :ref:`advanced search example `. +fq              string  A filter query on the search term. Multiple "fq" parameters can be used. See :ref:`advanced search example `. +show_entity_ids boolean Whether or not to show the database IDs of the search results (for developer use). +=============== ======= =========== Basic Search Example -------------------- diff --git a/doc/sphinx-guides/source/conf.py b/doc/sphinx-guides/source/conf.py index 0efeed88168..e2951dcd99a 100755 --- a/doc/sphinx-guides/source/conf.py +++ b/doc/sphinx-guides/source/conf.py @@ -64,9 +64,9 @@ # built documents. # # The short X.Y version. -version = '4.8.1' +version = '4.8.5' # The full version, including alpha/beta/rc tags. -release = '4.8.1' +release = '4.8.5' # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. diff --git a/doc/sphinx-guides/source/developers/big-data-support.rst b/doc/sphinx-guides/source/developers/big-data-support.rst index 5136a1707df..fa37176e6c3 100644 --- a/doc/sphinx-guides/source/developers/big-data-support.rst +++ b/doc/sphinx-guides/source/developers/big-data-support.rst @@ -112,7 +112,7 @@ Configuring the RSAL Mock Info for configuring the RSAL Mock: https://github.com/sbgrid/rsal/tree/master/mocks -Also, to configure Dataverse to use the new workflow you must do the following: +Also, to configure Dataverse to use the new workflow you must do the following (see also the section below on workflows): 1. Configure the RSAL URL: @@ -160,3 +160,99 @@ To specify replication sites that appear in rsync URLs: In the GUI, this is called "Local Access". It's where you can compute on files on your cluster. ``curl http://localhost:8080/api/admin/settings/:LocalDataAccessPath -X PUT -d "/programs/datagrid"`` + +Workflows +--------- + +Dataverse can perform two sequences of actions when datasets are published: one prior to publishing (marked by a ``PrePublishDataset`` trigger), and one after the publication has succeeded (``PostPublishDataset``). The pre-publish workflow is useful for having an external system prepare a dataset for being publicly accessed (a possibly lengthy activity that requires moving files around, uploading videos to a streaming server, etc.), or to start an approval process. A post-publish workflow might be used for sending notifications about the newly published dataset. + +Workflow steps are created using *step providers*. Dataverse ships with an internal step provider that offers some basic functionality, and with the ability to load third-party step providers. This allows installations to implement functionality they need without changing the Dataverse source code. + +Steps can be internal (say, writing some data to the log) or external. External steps involve Dataverse sending a request to an external system, and waiting for the system to reply. The wait period is arbitrary, and so allows the external system unbounded operation time. This is useful, e.g., for steps that require human intervention, such as manual approval of a dataset publication.
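+
+As a concrete illustration (the resume endpoint itself is described in the next paragraph), an external approval system would eventually resume such a paused step with a plain HTTP request. This is only a hedged sketch; the invocation id and request body are made-up placeholders:
+
+.. code:: bash
+
+   # resume a paused workflow step; the request body is handed back to the paused step
+   INVOCATION_ID="hypothetical-invocation-id"
+   curl -X POST -d "APPROVED" "http://localhost:8080/api/workflows/$INVOCATION_ID"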
+ +The external system reports the step result back to Dataverse by sending an HTTP ``POST`` command to ``api/workflows/{invocation-id}``. The body of the request is passed to the paused step for further processing. + +If a step in a workflow fails, Dataverse makes an effort to roll back all the steps that preceded it. Some actions, such as writing to the log, cannot be rolled back. If such an action has a public external effect (e.g. sending an email to a mailing list), it is advisable to put it in the post-publish workflow. + +.. tip:: + For invoking external systems using a REST API, Dataverse's internal step + provider offers a step for sending and receiving customizable HTTP requests. + It's called *http/sr*, and is detailed below. + +Administration +~~~~~~~~~~~~~~ + +A Dataverse instance stores a set of workflows in its database. Workflows can be managed using the ``api/admin/workflows/`` endpoints of the :doc:`/api/native-api`. Sample workflow files are available in ``scripts/api/data/workflows``. + +At the moment, defining a workflow for each trigger is done for the entire instance, using the endpoint ``api/admin/workflows/default/«trigger type»``. + +In order to prevent unauthorized resuming of workflows, Dataverse maintains a "white list" of IP addresses from which resume requests are honored. This list is maintained using the ``/api/admin/workflows/ip-whitelist`` endpoint of the :doc:`/api/native-api`. By default, Dataverse honors resume requests from localhost only (``127.0.0.1;::1``), so set-ups that use a single server work with no additional configuration. + + +Available Steps +~~~~~~~~~~~~~~~ + +Dataverse has an internal step provider, whose id is ``:internal``. It offers the following steps: + +log ++++ + +A step that writes data about the current workflow invocation to the instance log. It also writes the messages in its ``parameters`` map. + +.. code:: json + + { + "provider":":internal", + "stepType":"log", + "parameters": { + "aMessage": "message content", + "anotherMessage": "message content, too" + } + } + + +pause ++++++ + +A step that pauses the workflow. The workflow is paused until a POST request is sent to ``/api/workflows/{invocation-id}``. + +.. code:: json + + { + "provider":":internal", + "stepType":"pause" + } + + +http/sr ++++++++ + +A step that sends an HTTP request to an external system, and then waits for a response. The response has to match a regular expression specified in the step parameters. The url, content type, and message body can use data from the workflow context, using a simple markup language. This step has specific parameters for rollback. + +..
code:: json + + { + "provider":":internal", + "stepType":"http/sr", + "parameters": { + "url":"http://localhost:5050/dump/${invocationId}", + "method":"POST", + "contentType":"text/plain", + "body":"START RELEASE ${dataset.id} as ${dataset.displayName}", + "expectedResponse":"OK.*", + "rollbackUrl":"http://localhost:5050/dump/${invocationId}", + "rollbackMethod":"DELETE ${dataset.id}" + } + } + +Available variables are: + +* ``invocationId`` +* ``dataset.id`` +* ``dataset.identifier`` +* ``dataset.globalId`` +* ``dataset.displayName`` +* ``dataset.citation`` +* ``minorVersion`` +* ``majorVersion`` +* ``releaseStatus`` diff --git a/doc/sphinx-guides/source/developers/dev-environment.rst b/doc/sphinx-guides/source/developers/dev-environment.rst index 167cc7ba153..4e93fa05471 100755 --- a/doc/sphinx-guides/source/developers/dev-environment.rst +++ b/doc/sphinx-guides/source/developers/dev-environment.rst @@ -29,7 +29,7 @@ As a `Java Enterprise Edition + +In the OpenShift web interface you should see a link that looks something like http://dataverse-project1.192.168.99.100.nip.io but the IP address will vary and will match the output of ``minishift ip``. Eventually, after deployment is complete, the Dataverse web interface will appear at this URL and you will be able to log in with the username "dataverseAdmin" and the password "admin". + +Another way to verify that Dataverse has been successfully deployed is to make sure that the Dataverse "info" API endpoint returns a version (note that ``minishift ip`` is used because the IP address will vary): + +``curl http://dataverse-project1.`minishift ip`.nip.io/api/info/version`` + +From the perspective of OpenShift and the ``openshift.json`` config file, the HTTP link to Dataverse is called a route. See also documentation for ``oc expose``. + +Troubleshooting +~~~~~~~~~~~~~~~ + +Here are some tips on troubleshooting your deployment of Dataverse to Minishift. + +Check Status of Dataverse Deployment to Minishift +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``oc status`` + +Once images have been downloaded from Docker Hub, the output below will change from ``Pulling`` to ``Pulled``. + +``oc get events | grep Pull`` + +This is a deep dive: + +``oc get all`` + +Review Logs of Dataverse Deployment to Minishift +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Logs are provided in the web interface to each of the deployment configurations. The URLs should be something like this (but the IP address will vary) and you should click "View Log".
The installation of Dataverse is done within the Glassfish deployment configuration (the first link below): + +- https://192.168.99.100:8443/console/project/project1/browse/dc/dataverse-glassfish +- https://192.168.99.100:8443/console/project/project1/browse/dc/dataverse-postgresql +- https://192.168.99.100:8443/console/project/project1/browse/dc/dataverse-solr + +You can also see logs from each of the components (Glassfish, PostgreSQL, and Solr) from the command line with ``oc logs`` like this (just change the ``grep`` at the end): + +``oc logs $(oc get po -o json | jq '.items[] | select(.kind=="Pod").metadata.name' -r | grep glassfish)`` + +Get a Shell (ssh/rsh) on Containers Deployed to Minishift +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +You can get a shell on any of the containers for each of the components (Glassfish, PostgreSQL, and Solr) with ``oc rsh`` (just change the ``grep`` at the end): + +``oc rsh $(oc get po -o json | jq '.items[] | select(.kind=="Pod").metadata.name' -r | grep glassfish)`` + +From the ``rsh`` prompt of the Glassfish container you could run something like the following to make sure that Dataverse is running on port 8080: + +``curl http://localhost:8080/api/info/version`` + +Cleaning up +~~~~~~~~~~~ + +If you simply wanted to try out Dataverse on Minishift and now want to clean up, you can run ``oc delete project project1`` to delete the project or ``minishift stop`` and ``minishift delete`` to delete the entire Minishift VM and all the Docker containers inside it. + +Making Changes +~~~~~~~~~~~~~~ + +Making Changes to Docker Images +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you're interested in using Minishift for development and want to change the Dataverse code, you will need to get set up to create Docker images based on your changes and push them to a Docker registry such as Docker Hub (or Minishift's internal registry, if you can get that working, mentioned below). See the section below on Docker for details. + +Using Minishift for day-to-day Dataverse development might be something we want to investigate in the future. These blog posts talk about developing Java applications using Minishift/OpenShift: + +- https://blog.openshift.com/fast-iterative-java-development-on-openshift-kubernetes-using-rsync/ +- https://blog.openshift.com/debugging-java-applications-on-openshift-kubernetes/ + +Making Changes to the OpenShift Config +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you are interested in changing the OpenShift config file for Dataverse at ``conf/openshift/openshift.json`` note that in many cases once you have Dataverse running in Minishift you can use ``oc process`` and ``oc apply`` like this (but please note that some errors and warnings are expected): + +``oc process -f conf/openshift/openshift.json | oc apply -f -`` + +The slower way to iterate on the ``openshift.json`` file is to delete the project and re-create it. + +Running Containers as Root in Minishift +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It is **not** recommended to run containers as root in Minishift because for security reasons OpenShift doesn't support running containers as root. However, it's good to know how to allow containers to run as root in case you need to work on a Docker image to make it run as non-root.
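+
+When you do get around to making an image run as non-root, the heart of the "Support Arbitrary User IDs" guidelines linked in the next paragraph is preparing application directories so that any arbitrary, non-root user id in the root group can use them. Here is a hedged sketch of the relevant shell commands (typically run from a Dockerfile); the directory name is purely illustrative:
+
+.. code:: bash
+
+   # let an arbitrary, non-root UID (as assigned by OpenShift) read and write the app directory
+   mkdir -p /opt/dataverse
+   chgrp -R 0 /opt/dataverse
+   chmod -R g=u /opt/dataverse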
+ +For more information on improving Docker images to run as non-root, see "Support Arbitrary User IDs" at https://docs.openshift.org/latest/creating_images/guidelines.html#openshift-origin-specific-guidelines + +Let's say you have a container that you suspect works fine when it runs as root. You want to see it working as-is before you start hacking on the Dockerfile and entrypoint file. You can configure Minishift to allow containers to run as root with this command: + +``oc adm policy add-scc-to-user anyuid -z default --as system:admin`` + +Once you are done testing you can revert Minishift back to not allowing containers to run as root with this command: + +``oc adm policy remove-scc-from-user anyuid -z default --as system:admin`` + +Minishift Resources +~~~~~~~~~~~~~~~~~~~ + +The following resources might be helpful. + +- https://blog.openshift.com/part-1-from-app-to-openshift-runtimes-and-templates/ +- https://blog.openshift.com/part-2-creating-a-template-a-technical-walkthrough/ +- https://docs.openshift.com/enterprise/3.0/architecture/core_concepts/templates.html + +Docker +------ + +From the Dataverse perspective, Docker is important for a few reasons: + +- We are thankful that NDS Labs did the initial work to containerize Dataverse and include it in the "workbench" we mention in the :doc:`/installation/prep` section of the Installation Guide. The workbench allows people to kick the tires on Dataverse. +- There is interest from the community in running Dataverse on OpenShift and some initial work has been done to get Dataverse running on Minishift in Docker containers. Minishift makes use of Docker images on Docker Hub. To build new Docker images and push them to Docker Hub, you'll need to install Docker. The main issue to follow is https://github.com/IQSS/dataverse/issues/4040 . +- Docker may aid in testing efforts if we can easily spin up Docker images based on code in pull requests and run the full integration suite against those images. See the :doc:`testing` section for more information on integration tests. + +Installing Docker +~~~~~~~~~~~~~~~~~ + +On Linux, you can probably get Docker from your package manager. + +On Mac, download the ``.dmg`` from https://www.docker.com and install it. As of this writing is it known as Docker Community Edition for Mac. + +On Windows, FIXME ("Docker Community Edition for Windows" maybe???). + +As explained above, we use Docker images in two different contexts: + +- Testing using an "all in one" Docker image (ephemeral, unpublished) +- Future production use on Minishift/OpenShift/Kubernetes (published to Docker Hub) + +All In One Docker Images for Testing +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The "all in one" Docker files are in ``conf/docker-aio`` and you should follow the readme in that directory for more information on how to use them. + +Future production use on Minishift/OpenShift/Kubernetes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When working with Docker in the context of Minishift, follow the instructions above and make sure you get the Dataverse Docker images running in Minishift before you start messing with them. + +As of this writing, the Dataverse Docker images we publish under https://hub.docker.com/u/iqss/ are highly experimental. They were originally tagged with branch names like ``kick-the-tires`` and as of this writing the ``latest`` tag should be considered highly experimental and not for production use. 
See https://github.com/IQSS/dataverse/issues/4040 for the latest status and please reach out if you'd like to help! + +Change to the docker directory: + +``cd conf/docker`` + +Edit one of the files: + +``vim dataverse-glassfish/Dockerfile`` + +At this point you want to build the image and run it. We are assuming you want to run it in your Minishift environment. We will be building your image and pushing it to Docker Hub. Then you will be pulling the image down from Docker Hub to run in your Minishift installation. If this sounds inefficient, you're right, but we haven't been able to figure out how to make use of Minishift's built-in registry (see below) so we're pushing to Docker Hub instead. + +Log in to Docker Hub with an account that has access to push to the ``iqss`` organization: + +``docker login`` + +(If you don't have access to push to the ``iqss`` organization, you can push elsewhere and adjust your ``openshift.json`` file accordingly.) + +Build and push the images to Docker Hub: + +``./build.sh`` + +Note that you will see output such as ``digest: sha256:213b6380e6ee92607db5d02c9e88d7591d81f4b6d713224d47003d5807b93d4b`` that should later be reflected in Minishift to indicate that you are using the latest image you just pushed to Docker Hub. + +You can get a list of all repos under the ``iqss`` organization with this: + +``curl https://hub.docker.com/v2/repositories/iqss/`` + +To see a specific repo: + +``curl https://hub.docker.com/v2/repositories/iqss/dataverse-glassfish/`` + +Known Issues with Dataverse Images on Docker Hub +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Again, Dataverse Docker images on Docker Hub are highly experimental at this point. As of this writing, their purpose is primarily for kicking the tires on Dataverse. Here are some known issues: + +- The Dataverse installer is run in the entrypoint script every time you run the image. Ideally, Dataverse would be installed in the Dockerfile instead. Dataverse is being installed in the entrypoint script because it needs PostgreSQL to be up already so that database tables can be created when the war file is deployed. +- The storage should be abstracted: storage of data files, PostgreSQL data, and probably Solr data. +- Better tuning of memory by examining ``/sys/fs/cgroup/memory/memory.limit_in_bytes`` and incorporating this into the Dataverse installation script. +- Only a single Glassfish server can be used. See "Dedicated timer server in a Dataverse server cluster" in the :doc:`/admin/timers` section of the Installation Guide. +- Only a single PostgreSQL server can be used. +- Only a single Solr server can be used. + +Get Set Up to Push Docker Images to Minishift Registry +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +FIXME https://docs.openshift.org/latest/minishift/openshift/openshift-docker-registry.html indicates that it should be possible to make use of the built-in registry in Minishift while iterating on Docker images but you may get "unauthorized: authentication required" when trying to push to it as reported at https://github.com/minishift/minishift/issues/817 so until we figure this out, you must push to Docker Hub instead. Run ``docker login`` and use the ``conf/docker/build.sh`` script to push Docker images you create to https://hub.docker.com/u/iqss/ + +If you want to troubleshoot this, take a close look at the ``docker login`` command you're using to make sure the OpenShift token is being sent. + +An alternative to using the Minishift Registry is to do a local build.
This isn't documented but should work within Minishift because it's an all-in-one OpenShift environment. The steps at a high level are to ssh into the Minishift VM and then do a ``docker build``. For a stateful set, the image pull policy should be never. + ---- Previous: :doc:`intro` | Next: :doc:`version-control` diff --git a/doc/sphinx-guides/source/developers/intro.rst b/doc/sphinx-guides/source/developers/intro.rst index 633297e74b7..d3a48abf2b1 100755 --- a/doc/sphinx-guides/source/developers/intro.rst +++ b/doc/sphinx-guides/source/developers/intro.rst @@ -12,20 +12,21 @@ Intended Audience This guide is intended primarily for developers who want to work on the main Dataverse code base at https://github.com/IQSS/dataverse -To get started, you'll want to set up your :doc:`dev-environment` and make sure you understand the branching strategy described in the :doc:`version-control` section. :doc:`testing` is expected. Opinions about :doc:`coding-style` are welcome! +(See "Related Projects" below for other code you can work on!) + +To get started, you'll want to set up your :doc:`dev-environment` and make sure you understand the branching strategy described in the :doc:`version-control` section and how to make a pull request. :doc:`testing` is expected. Opinions about :doc:`coding-style` are welcome! If you have any questions at all, please reach out to other developers per https://github.com/IQSS/dataverse/blob/master/CONTRIBUTING.md Roadmap ------- -For the Dataverse development roadmap, please see https://github.com/IQSS/dataverse/milestones +For the Dataverse development roadmap, please see https://dataverse.org/goals-roadmap-and-releases -The `Contributing to Dataverse `_ document in the root of the source tree provides guidance on: +Kanban Board +------------ -- the use of `labels `_ to organize and prioritize `issues `_ -- making pull requests -- how to contact the development team +You can get a sense of what's currently in flight (in dev, in QA, etc.) by looking at https://waffle.io/IQSS/dataverse Related Guides -------------- @@ -39,13 +40,15 @@ Related Projects As a developer, you also may be interested in these projects related to Dataverse: +- External Tools - add additional features to Dataverse: See the :doc:`/installation/external-tools` section of the Installation Guide. +- Dataverse API client libraries - use Dataverse APIs from various languages: :doc:`/api/client-libraries` - Miniverse - expose metrics from a Dataverse database: https://github.com/IQSS/miniverse -- `Zelig `_ (R) - run statistical models on files uploaded to Dataverse: https://github.com/IQSS/Zelig -- `TwoRavens `_ (Javascript) - a `d3.js `_ interface for exploring data and running Zelig models: https://github.com/IQSS/TwoRavens +- Configuration management scripts - Ansible, Puppet, etc.: See "Advanced Installation" in the :doc:`/installation/prep` section of the Installation Guide. 
- :doc:`/developers/unf/index` (Java) - a Universal Numerical Fingerprint: https://github.com/IQSS/UNF -- `DataTags `_ (Java and Scala) - tag datasets with privacy levels: https://github.com/IQSS/DataTags - GeoConnect (Python) - create a map by uploading files to Dataverse: https://github.com/IQSS/geoconnect -- Dataverse API client libraries - use Dataverse APIs from various languages: :doc:`/api/client-libraries` +- `DataTags `_ (Java and Scala) - tag datasets with privacy levels: https://github.com/IQSS/DataTags +- `TwoRavens `_ (Javascript) - a `d3.js `_ interface for exploring data and running Zelig models: https://github.com/IQSS/TwoRavens +- `Zelig `_ (R) - run statistical models on files uploaded to Dataverse: https://github.com/IQSS/Zelig - Third party apps - make use of Dataverse APIs: :doc:`/api/apps` - chat.dataverse.org - chat interface for Dataverse users and developers: https://github.com/IQSS/chat.dataverse.org - [Your project here] :) diff --git a/doc/sphinx-guides/source/developers/selinux.rst b/doc/sphinx-guides/source/developers/selinux.rst index 582510c5847..d7f5b0d7519 100644 --- a/doc/sphinx-guides/source/developers/selinux.rst +++ b/doc/sphinx-guides/source/developers/selinux.rst @@ -8,7 +8,7 @@ SELinux Introduction ------------ -The ``shibboleth.te`` file below that is mentioned in the :doc:`/installation/shibboleth` section of the Installation Guide was created on CentOS 6 as part of https://github.com/IQSS/dataverse/issues/3406 but may need to be revised for future versions of RHEL/CentOS. The file is versioned with the docs and can be found in the following location: +The ``shibboleth.te`` file below that is mentioned in the :doc:`/installation/shibboleth` section of the Installation Guide was created on CentOS 6 as part of https://github.com/IQSS/dataverse/issues/3406 but may need to be revised for future versions of RHEL/CentOS (pull requests welcome!). The file is versioned with the docs and can be found in the following location: ``doc/sphinx-guides/source/_static/installation/files/etc/selinux/targeted/src/policy/domains/misc/shibboleth.te`` diff --git a/doc/sphinx-guides/source/developers/testing.rst b/doc/sphinx-guides/source/developers/testing.rst index 229ed63ac3e..eeb2512d76d 100755 --- a/doc/sphinx-guides/source/developers/testing.rst +++ b/doc/sphinx-guides/source/developers/testing.rst @@ -94,6 +94,8 @@ Unit tests are run automatically on every build, but dev environments and server The :doc:`dev-environment` section currently refers developers here for advice on getting set up to run REST Assured tests, but we'd like to add some sort of "dev" flag to the installer to put Dataverse in "insecure" mode, with lots of scary warnings that this dev mode should not be used in production. +The instructions below assume a relatively static dev environment on a Mac. There is a newer "all in one" Docker-based approach documented in the :doc:`dev-environment` section under "Docker" that you may like to play with as well. + The Burrito Key ^^^^^^^^^^^^^^^ @@ -166,15 +168,34 @@ Once installed, you may run commands with ``mvn [options] [] [` mentioned in ``conf/docker-aio/readme.txt`` for the current list of IT tests that are expected to pass. Here's a dump of that file: + +.. 
literalinclude:: ../../../../conf/docker-aio/run-test-suite.sh + Future Work ----------- @@ -196,6 +217,7 @@ Future Work on Integration Tests - Attempt to use @openscholar approach for running integration tests using Travis https://github.com/openscholar/openscholar/blob/SCHOLAR-3.x/.travis.yml (probably requires using Ubuntu rather than CentOS) - Generate code coverage reports for **integration** tests: https://github.com/pkainulainen/maven-examples/issues/3 and http://www.petrikainulainen.net/programming/maven/creating-code-coverage-reports-for-unit-and-integration-tests-with-the-jacoco-maven-plugin/ - Consistent logging of API Tests. Show test name at the beginning and end and status codes returned. +- expected passing and known/expected failing integration tests: https://github.com/IQSS/dataverse/issues/4438 Browser-Based Testing ~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/sphinx-guides/source/developers/unf/index.rst b/doc/sphinx-guides/source/developers/unf/index.rst index 0651891242d..956e2d42dab 100644 --- a/doc/sphinx-guides/source/developers/unf/index.rst +++ b/doc/sphinx-guides/source/developers/unf/index.rst @@ -27,10 +27,7 @@ with Dataverse 2.0 and throughout the 3.* lifecycle, UNF v.5 UNF v.6. Two parallel implementation, in R and Java, will be available, for cross-validation. -Learn more: Micah Altman, Jeff Gill and Michael McDonald, 2003, -`Numerical Issues in Statistical Computing for the Social Scientist -`_, -New York: John Wiley. +Learn more: Micah Altman and Gary King. 2007. “A Proposed Standard for the Scholarly Citation of Quantitative Data.” D-Lib Magazine, 13. Publisher’s Version Copy at http://j.mp/2ovSzoT **Contents:** diff --git a/doc/sphinx-guides/source/index.rst b/doc/sphinx-guides/source/index.rst index a06f75cb120..1757760ab28 100755 --- a/doc/sphinx-guides/source/index.rst +++ b/doc/sphinx-guides/source/index.rst @@ -3,10 +3,10 @@ You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. -Dataverse 4.8.1 Guides +Dataverse 4.8.5 Guides ====================== -These guides are for the most recent version of Dataverse. For the guides for **version 4.7.1** please go `here `_. +These guides are for the most recent version of Dataverse. For the guides for **version 4.8.4** please go `here `_. .. toctree:: :glob: @@ -14,12 +14,11 @@ These guides are for the most recent version of Dataverse. For the guides for ** :maxdepth: 2 user/index - installation/index + admin/index api/index + installation/index developers/index style/index - admin/index - workflows How the Guides Are Organized ============================= diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 31bbed45c07..f9fed23ecdc 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -284,27 +284,32 @@ You can configure this redirect properly in your cloud environment to generate a Amazon S3 Storage +++++++++++++++++ -For institutions and organizations looking to use Amazon's S3 cloud storage for their installation, this can be set up manually through creation of a credentials file or automatically via the aws console commands. +For institutions and organizations looking to use Amazon's S3 cloud storage for their installation, this can be set up manually through creation of the credentials and config files or automatically via the aws console commands. 
You'll need an AWS account with an associated S3 bucket for your installation to use. From the S3 management console (e.g. ``_), you can poke around and get familiar with your bucket. We recommend using IAM (Identity and Access Management) to create a user with full S3 access and nothing more, for security reasons. See ``_ for more info on this process. -Make note of the bucket's name and the region its data is hosted in. Dataverse and the aws SDK rely on the placement of a key file located in ``~/.aws/credentials``, which can be generated via either of these two methods. +Make note of the bucket's name and the region its data is hosted in. Dataverse and the AWS SDK make use of "AWS credentials profile file" and "AWS config profile file" located in ``~/.aws/`` where ``~`` is the home directory of the user you run Glassfish as. This file can be generated via either of two methods described below. It's also possible to use IAM Roles rather than the credentials file. Please note that in this case you will need anyway the config file to specify the region. -Setup aws manually -^^^^^^^^^^^^^^^^^^ +Set Up credentials File Manually +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -To create ``credentials`` manually, you will need to generate a key/secret key. The first step is to log onto your aws web console (e.g. ``_). If you have created a user in AWS IAM, you can click on that user and generate the keys needed for dataverse. +To create the ``credentials`` file manually, you will need to generate a key/secret key. The first step is to log onto your aws web console (e.g. ``_). If you have created a user in AWS IAM, you can click on that user and generate the keys needed for Dataverse. -Once you have acquired the keys, they need to be added to``credentials``. The format for credentials is as follows: +Once you have acquired the keys, they need to be added to the ``credentials`` file. The format for credentials is as follows: | ``[default]`` | ``aws_access_key_id = `` | ``aws_secret_access_key = `` -Place this file in a folder named ``.aws`` under the home directory for the user running your dataverse installation. +You must also specify the AWS region, in the ``config`` file, for example: -Setup aws via command line tools -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +| ``[default]`` +| ``region = us-east-1`` + +Place these two files in a folder named ``.aws`` under the home directory for the user running your Dataverse Glassfish instance. (From the `AWS Command Line Interface Documentation `_: "In order to separate credentials from less sensitive options, region and output format are stored in a separate file named config in the same folder") + +Set Up Access Configuration Via Command Line Tools +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Begin by installing the CLI tool `pip `_ to install the `AWS command line interface `_ if you don't have it. @@ -314,9 +319,14 @@ First, we'll get our access keys set up. If you already have your access keys co ``aws configure`` -You'll be prompted to enter your Access Key ID and secret key, which should be issued to your AWS account. The subsequent config steps after the access keys are up to you. For reference, these keys are stored in ``~/.aws/credentials``. +You'll be prompted to enter your Access Key ID and secret key, which should be issued to your AWS account. The subsequent config steps after the access keys are up to you. For reference, the keys will be stored in ``~/.aws/credentials``, and your AWS access region in ``~/.aws/config``. 
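+
+If you prefer a non-interactive setup (when scripting the configuration, for example), the AWS CLI can write the same two files directly. This is a hedged sketch with placeholder values; remember to run it as the same user that runs Glassfish so the files end up in that user's home directory:
+
+.. code:: bash
+
+   # writes ~/.aws/credentials and ~/.aws/config for the default profile
+   aws configure set aws_access_key_id AKIAIOSFODNN7EXAMPLE
+   aws configure set aws_secret_access_key REPLACE_WITH_YOUR_SECRET_KEY
+   aws configure set region us-east-1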
-Configure dataverse to use aws/S3 +Using an IAM Role with EC2 +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If you are hosting Dataverse on an AWS EC2 instance alongside storage in S3, it is possible to use IAM Roles instead of the credentials file (the file at ``~/.aws/credentials`` mentioned above). Please note that you will still need the ``~/.aws/config`` file to specify the region. For more information on this option, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html + +Configure Dataverse to Use AWS/S3 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ With your access to your bucket in place, we'll want to navigate to ``/usr/local/glassfish4/glassfish/bin/`` and execute the following ``asadmin`` commands to set up the proper JVM options. Recall that out of the box, Dataverse is configured to use local file storage. You'll need to delete the existing storage driver before setting the new one. @@ -329,6 +339,14 @@ Then, we'll need to identify which S3 bucket we're using. Replace ``your_bucket_ ``./asadmin create-jvm-options "-Ddataverse.files.s3-bucket-name=your_bucket_name"`` +Optionally, you can have users download files from S3 directly rather than having files pass from S3 through Glassfish to your users. To accomplish this, set ``dataverse.files.s3-download-redirect`` to ``true`` like this: + +``./asadmin create-jvm-options "-Ddataverse.files.s3-download-redirect=true"`` + +If you enable ``dataverse.files.s3-download-redirect`` as described above, note that the S3 URLs expire after an hour by default but you can configure the expiration time using the ``dataverse.files.s3-url-expiration-minutes`` JVM option. Here's an example of setting the expiration time to 120 minutes: + +``./asadmin create-jvm-options "-D dataverse.files.s3-url-expiration-minutes=120"`` + Lastly, go ahead and restart your glassfish server. With Dataverse deployed and the site online, you should be able to upload datasets and data files and see the corresponding files in your S3 bucket. Within a bucket, the folder structure emulates that found in local file storage. .. _Branding Your Installation: @@ -385,6 +403,13 @@ Once you have the location of your custom header HTML file, run this curl comman ``curl -X PUT -d '/var/www/dataverse/branding/custom-header.html' http://localhost:8080/api/admin/settings/:HeaderCustomizationFile`` +If you have enabled a custom header or navbar logo, you might prefer to disable the theme of the root dataverse. You can do so by setting ``:DisableRootDataverseTheme`` to ``true`` like this: + +``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:DisableRootDataverseTheme`` + +Please note: Disabling the display of the root dataverse theme also disables your ability to edit it. Remember that dataverse owners can set their dataverses to "inherit theme" from the root. Those dataverses will continue to inherit the root dataverse theme (even though it no longer displays on the root). If you would like to edit the root dataverse theme in the future, you will have to re-enable it first. + + Custom Footer +++++++++++++ @@ -598,6 +623,11 @@ dataverse.handlenet.admprivphrase +++++++++++++++++++++++++++++++++ This JVM setting is also part of **handles** configuration. The Handle.Net installer lets you choose whether to encrypt the admcredfile private key or not. If you do encrypt it, this is the pass phrase that it's encrypted with. +dataverse.timerServer ++++++++++++++++++++++ + +This JVM option is only relevant if you plan to run multiple Glassfish servers for redundancy. 
Only one Glassfish server can act as the dedicated timer server and for details on promoting or demoting a Glassfish server to handle this responsibility, see :doc:`/admin/timers`. + Database Settings ----------------- @@ -674,6 +704,11 @@ See :ref:`Branding Your Installation` above. See :ref:`Branding Your Installation` above. +:DisableRootDataverseTheme +++++++++++++++++++++++++++ + +See :ref:`Branding Your Installation` above. + :FooterCustomizationFile ++++++++++++++++++++++++ @@ -779,6 +814,7 @@ Specify a URL where users can read your Privacy Policy, linked from the bottom o ++++++++++++++ Specify a URL where users can read your API Terms of Use. +API users can retrieve this URL from the SWORD Service Document or the "info" section of our :doc:`/api/native-api` documentation. ``curl -X PUT -d http://best-practices.dataverse.org/harvard-policies/harvard-api-tou.html http://localhost:8080/api/admin/settings/:ApiTermsOfUse`` @@ -843,10 +879,15 @@ After you've set ``:StatusMessageHeader`` you can also make it clickable to have +++++++++++++++++++++++++ Set `MaxFileUploadSizeInBytes` to "2147483648", for example, to limit the size of files uploaded to 2 GB. + Notes: + - For SWORD, this size is limited by the Java Integer.MAX_VALUE of 2,147,483,647. (see: https://github.com/IQSS/dataverse/issues/2169) + - If the MaxFileUploadSizeInBytes is NOT set, uploads, including SWORD may be of unlimited size. +- For larger file upload sizes, you may need to configure your reverse proxy timeout. If using apache2 (httpd) with Shibboleth, add a timeout to the ProxyPass defined in etc/httpd/conf.d/ssl.conf (which is described in the :doc:`/installation/shibboleth` setup). + ``curl -X PUT -d 2147483648 http://localhost:8080/api/admin/settings/:MaxFileUploadSizeInBytes`` :ZipDownloadLimit @@ -907,14 +948,12 @@ The relative path URL to which users will be sent after signup. The default sett :TwoRavensUrl +++++++++++++ -The location of your TwoRavens installation. Activation of TwoRavens also requires the setting below, ``TwoRavensTabularView`` +The ``:TwoRavensUrl`` option is no longer valid. See :doc:`r-rapache-tworavens` and :doc:`external-tools`. :TwoRavensTabularView +++++++++++++++++++++ -Set ``TwoRavensTabularView`` to true to allow a user to view tabular files via the TwoRavens application. This boolean affects whether a user will see the "Explore" button. - -``curl -X PUT -d true http://localhost:8080/api/admin/settings/:TwoRavensTabularView`` +The ``:TwoRavensTabularView`` option is no longer valid. See :doc:`r-rapache-tworavens` and :doc:`external-tools`. :GeoconnectCreateEditMaps +++++++++++++++++++++++++ @@ -1006,6 +1045,16 @@ or ``curl -X PUT -d hostname.domain.tld/stats http://localhost:8080/api/admin/settings/:PiwikAnalyticsHost`` +:PiwikAnalyticsTrackerFileName +++++++++++++++++++++++++++++++ + +Filename for the 'php' and 'js' tracker files used in the piwik code (piwik.php and piwik.js). +Sometimes these files are renamed in order to prevent ad-blockers (in the browser) to block the piwik tracking code. +This sets the base name (without dot and extension), if not set it defaults to 'piwik'. 
+ +``curl -X PUT -d domainstats http://localhost:8080/api/admin/settings/:PiwikAnalyticsTrackerFileName`` + + :FileFixityChecksumAlgorithm ++++++++++++++++++++++++++++ @@ -1204,3 +1253,10 @@ You can replace the default dataset metadata fields that are displayed above fil ``curl http://localhost:8080/api/admin/settings/:CustomDatasetSummaryFields -X PUT -d 'producer,subtitle,alternativeTitle'`` You have to put the datasetFieldType name attribute in the :CustomDatasetSummaryFields setting for this to work. + +:AllowApiTokenLookupViaApi +++++++++++++++++++++++++++ + +Dataverse 4.8.1 and below allowed API Token lookup via API but for better security this has been disabled by default. Set this to true if you really want the old behavior. + +``curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:AllowApiTokenLookupViaApi`` diff --git a/doc/sphinx-guides/source/installation/external-tools.rst b/doc/sphinx-guides/source/installation/external-tools.rst new file mode 100644 index 00000000000..77c7b9bb1b5 --- /dev/null +++ b/doc/sphinx-guides/source/installation/external-tools.rst @@ -0,0 +1,63 @@ +External Tools +============== + +External tools can provide additional features that are not part of Dataverse itself, such as data exploration. See the "Writing Your Own External Tool" section below for more information on developing your own tool for Dataverse. + +.. contents:: |toctitle| + :local: + +Inventory of External Tools +--------------------------- + +Support for external tools is just getting off the ground but TwoRavens has been converted into an external tool. See the :doc:`/user/data-exploration/tworavens` section of the User Guide for more information on TwoRavens from the user perspective and :doc:`r-rapache-tworavens` for more information on installing TwoRavens. + +- TwoRavens: a system of interlocking statistical tools for data exploration, analysis, and meta-analysis: http://2ra.vn +- Data Explorer: a GUI which lists the variables in a tabular data file allowing searching, charting and cross tabulation analysis. For installation instructions see the README.md file at https://github.com/scholarsportal/Dataverse-Data-Explorer. +- [Your tool here! Please get in touch! :) ] + +Downloading and Adjusting an External Tool Manifest File +-------------------------------------------------------- + +In order to make external tools available within Dataverse, you need to configure Dataverse to be aware of them. + +External tools must be expressed in an external tool manifest file, a specific JSON format Dataverse requires. The author of the external tool may be able to provide you with a JSON file and installation instructions. The JSON file might look like this: + +.. literalinclude:: ../_static/installation/files/root/external-tools/awesomeTool.json + +``type`` is required and must be ``explore`` or ``configure`` to make the tool appear under a button called "Explore" or "Configure", respectively. Currently external tools only operate on tabular files that have been successfully ingested. (For more on ingest, see the :doc:`/user/tabulardataingest/ingestprocess` of the User Guide.) + +In the example above, a mix of required and optional reserved words appear that can be used to insert dynamic values into tools. The supported values are: + +- ``{fileId}`` (required) - The Dataverse database ID of a file the external tool has been launched on. +- ``{siteUrl}`` (optional) - The URL of the Dataverse installation that hosts the file with the fileId above. 
+- ``{apiToken}`` (optional) - The Dataverse API token of the user launching the external tool, if available. + +Making an External Tool Available in Dataverse +---------------------------------------------- + +If the JSON file were called, for example, :download:`awesomeTool.json <../_static/installation/files/root/external-tools/awesomeTool.json>` you would make any necessary adjustments, as described above, and then make the tool available within Dataverse with the following curl command: + +``curl -X POST -H 'Content-type: application/json' --upload-file awesomeTool.json http://localhost:8080/api/admin/externalTools`` + +Listing all External Tools in Dataverse +--------------------------------------- + +To list all the external tools that are available in Dataverse: + +``curl http://localhost:8080/api/admin/externalTools`` + +Removing an External Tool Available in Dataverse +------------------------------------------------ + +Assuming the external tool database id is "1", remove it with the following command: + +``curl -X DELETE http://localhost:8080/api/admin/externalTools/1`` + +Writing Your Own External Tool +------------------------------ + +If you have an idea for an external tool, please let the Dataverse community know by posting about it on the dataverse-community mailing list: https://groups.google.com/forum/#!forum/dataverse-community + +If you need help with your tool, please feel free to post on the dataverse-dev mailing list: https://groups.google.com/forum/#!forum/dataverse-dev + +Once you've gotten your tool working, please make a pull request to update the list of tools above. diff --git a/doc/sphinx-guides/source/installation/index.rst b/doc/sphinx-guides/source/installation/index.rst index b8423e77ae5..94a0d1231d1 100755 --- a/doc/sphinx-guides/source/installation/index.rst +++ b/doc/sphinx-guides/source/installation/index.rst @@ -20,3 +20,4 @@ Installation Guide geoconnect shibboleth oauth2 + external-tools diff --git a/doc/sphinx-guides/source/installation/installation-main.rst b/doc/sphinx-guides/source/installation/installation-main.rst index 3a4a479093c..12a2eeefee9 100755 --- a/doc/sphinx-guides/source/installation/installation-main.rst +++ b/doc/sphinx-guides/source/installation/installation-main.rst @@ -13,19 +13,19 @@ Running the Dataverse Installer ------------------------------- A scripted, interactive installer is provided. This script will configure your Glassfish environment, create the database, set some required options and start the application. Some configuration tasks will still be required after you run the installer! So make sure to consult the next section. -At this point the installer only runs on RHEL 6 and similar and MacOS X (recommended as the platform for developers). -Generally, the installer has a better chance of succeeding if you run it against a freshly installed Glassfish node that still has all the default configuration settings. In any event, please make sure that it is still configured to accept http connections on port 8080 - because that's where the installer expects to find the application once it's deployed. +As mentioned in the :doc:`prerequisites` section, RHEL/CentOS is the recommended Linux distribution. (The installer is also known to work on Mac OS X for setting up a development environment.) +Generally, the installer has a better chance of succeeding if you run it against a freshly installed Glassfish node that still has all the default configuration settings. 
In any event, please make sure that it is still configured to accept http connections on port 8080 - because that's where the installer expects to find the application once it's deployed. You should have already downloaded the installer from https://github.com/IQSS/dataverse/releases when setting up and starting Solr under the :doc:`prerequisites` section. Again, it's a zip file with "dvinstall" in the name. Unpack the zip file - this will create the directory ``dvinstall``. -Execute the installer script like this:: +Execute the installer script like this (but first read the note below about not running the installer as root):: - # cd dvinstall - # ./install + $ cd dvinstall + $ ./install **It is no longer necessary to run the installer as root!** @@ -38,7 +38,6 @@ Just make sure the user running the installer has write permission to: The only reason to run Glassfish as root would be to allow Glassfish itself to listen on the default HTTP(S) ports 80 and 443, or any other port below 1024. However, it is simpler and more secure to run Glassfish run on its default port of 8080 and hide it behind an Apache Proxy, via AJP, running on port 80 or 443. This configuration is required if you're going to use Shibboleth authentication. See more discussion on this here: :doc:`shibboleth`.) - The script will prompt you for some configuration values. If this is a test/evaluation installation, it may be possible to accept the default values provided for most of the settings: - Internet Address of your host: localhost @@ -53,29 +52,24 @@ The script will prompt you for some configuration values. If this is a test/eval - Name of the Postgres User: dvnapp - Postgres user password: secret - Remote Solr indexing service: LOCAL -- Will this Dataverse be using TwoRavens application: NOT INSTALLED - Rserve Server: localhost - Rserve Server Port: 6311 - Rserve User Name: rserve - Rserve User Password: rserve +- Administration Email address for the installation; +- Postgres admin password - We'll need it in order to create the database and user for the Dataverse to use, without having to run the installer as root. If you don't know your Postgres admin password, you may simply set the authorization level for localhost to "trust" in the PostgreSQL ``pg_hba.conf`` file (See the PostgreSQL section in the Prerequisites). If this is a production environment, you may want to change it back to something more secure, such as "password" or "md5", after the installation is complete. +- Network address of a remote Solr search engine service (if needed) - In most cases, you will be running your Solr server on the same host as the Dataverse application (then you will want to leave this set to the default value of ``LOCAL``). But in a serious production environment you may set it up on a dedicated separate server. If desired, these default values can be configured by creating a ``default.config`` (example :download:`here <../_static/util/default.config>`) file in the installer's working directory with new values (if this file isn't present, the above defaults will be used). This allows the installer to be run in non-interactive mode (with ``./install -y -f > install.out 2> install.err``), which can allow for easier interaction with automated provisioning tools. -**New, as of 4.3:** +All the Glassfish configuration tasks performed by the installer are isolated in the shell script ``dvinstall/glassfish-setup.sh`` (as ``asadmin`` commands).
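If you are driving the installer from a provisioning tool, the non-interactive mode mentioned above can be wrapped in a small shell script. The following is only a sketch (the log file names are just the ones used in the example above, and the error handling is illustrative, not something the installer requires); it runs the documented ``./install -y -f`` invocation from the unpacked ``dvinstall`` directory and surfaces the logs if something goes wrong::

    #!/bin/sh
    # Sketch: unattended run of the Dataverse installer.
    # Assumes the dvinstall zip has been unpacked in the current directory and,
    # optionally, that a default.config with your values sits next to ./install.
    cd dvinstall || exit 1

    # -y -f runs the installer non-interactively; keep stdout/stderr for review.
    if ./install -y -f > install.out 2> install.err ; then
        echo "Installer finished; see install.out for details."
    else
        echo "Installer failed; last lines of install.err follow:" >&2
        tail -n 20 install.err >&2
        exit 1
    fi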
-- Administration Email address for the installation; -- Postgres admin password - We'll need it in order to create the database and user for the Dataverse to use, without having to run the installer as root. If you don't know your Postgres admin password, you may simply set the authorization level for localhost to "trust" in the PostgreSQL ``pg_hba.conf`` file (See the PostgreSQL section in the Prerequisites). If this is a production evnironment, you may want to change it back to something more secure, such as "password" or "md5", after the installation is complete. -- Network address of a remote Solr search engine service (if needed) - In most cases, you will be running your Solr server on the same host as the Dataverse application (then you will want to leave this set to the default value of ``LOCAL``). But in a serious production environment you may set it up on a dedicated separate server. -- The URL of the TwoRavens application GUI, if this Dataverse node will be using a companion TwoRavens installation. Otherwise, leave it set to ``NOT INSTALLED``. +**IMPORTANT:** Please note that "out of the box" the installer will configure the Dataverse to leave unrestricted access to the administration APIs from (and only from) localhost. Please consider the security implications of this arrangement (anyone with shell access to the server can potentially mess with your Dataverse). An alternative solution would be to block open access to these sensitive API endpoints completely; and to only allow requests supplying a pre-defined "unblock token" (password). If you prefer that as a solution, please consult the supplied script ``post-install-api-block.sh`` for examples on how to set it up. See also "Securing Your Installation" under the :doc:`config` section. The script is to a large degree a derivative of the old installer from DVN 3.x. It is written in Perl. If someone in the community is eager to rewrite it, perhaps in a different language, please get in touch. :) -All the Glassfish configuration tasks performed by the installer are isolated in the shell script ``dvinstall/glassfish-setup.sh`` (as ``asadmin`` commands). - -**IMPORTANT:** Please note, that "out of the box" the installer will configure the Dataverse to leave unrestricted access to the administration APIs from (and only from) localhost. Please consider the security implications of this arrangement (anyone with shell access to the server can potentially mess with your Dataverse). An alternative solution would be to block open access to these sensitive API endpoints completely; and to only allow requests supplying a pre-defined "unblock token" (password). If you prefer that as a solution, please consult the supplied script ``post-install-api-block.sh`` for examples on how to set it up. - Logging In ---------- diff --git a/doc/sphinx-guides/source/installation/prep.rst b/doc/sphinx-guides/source/installation/prep.rst index 9662b5c40b6..9f39cf74063 100644 --- a/doc/sphinx-guides/source/installation/prep.rst +++ b/doc/sphinx-guides/source/installation/prep.rst @@ -14,6 +14,13 @@ We'll try to get you up and running as quickly as possible, but we thought you m Choose Your Own Installation Adventure -------------------------------------- +NDS Labs Workbench (for Testing Only) ++++++++++++++++++++++++++++++++++++++ + +The National Data Service (NDS) is a community-driven effort guided by the National Data Service Consortium.
NDS Labs has packaged Dataverse as `one of many data management tools `_ that can be quickly deployed for evaluation purposes in their tool based on Kubernetes called NDS Labs Workbench. To get started, visit http://www.nationaldataservice.org/projects/labs.html . + +Please note that the version of Dataverse in NDS Labs Workbench may lag behind the latest release. Craig Willis from NDS Labs did an excellent job of adding Dataverse 4 to NDS Labs Workbench and the Dataverse team hopes to some day take over the creation of Docker images so the latest version of Dataverse can be evaluated in the workbench. + Vagrant (for Testing Only) ++++++++++++++++++++++++++ @@ -51,6 +58,9 @@ Architecture and Components Dataverse is a Java Enterprise Edition (EE) web application that is shipped as a war (web archive) file. +Required Components ++++++++++++++++++++ + When planning your installation you should be aware of the following components of the Dataverse architecture: - Linux: RHEL/CentOS is highly recommended since all development and QA happens on this distribution. @@ -60,9 +70,12 @@ When planning your installation you should be aware of the following components - SMTP server: for sending mail for password resets and other notifications. - Persistent identifier service: DOI and Handle support are provided. Production use requires a registered DOI or Handle.net authority. +Optional Components ++++++++++++++++++++ + There are a number of optional components you may choose to install or configure, including: -- R, rApache, Zelig, and TwoRavens: :doc:`/user/data-exploration/tworavens` describes the feature and :doc:`r-rapache-tworavens` describes how to install these components. +- R, rApache, Zelig, and TwoRavens: :doc:`/user/data-exploration/tworavens` describes the feature and :doc:`r-rapache-tworavens` describes how to install these components. :doc:`external-tools` explains how third-party tools like TwoRavens can be added to Dataverse. - Dropbox integration: for uploading files from the Dropbox API. - Apache: a web server that can "reverse proxy" Glassfish applications and rewrite HTTP traffic. - Shibboleth: an authentication system described in :doc:`shibboleth`. Its use with Dataverse requires Apache. diff --git a/doc/sphinx-guides/source/installation/prerequisites.rst b/doc/sphinx-guides/source/installation/prerequisites.rst index db3573bff8b..f4955e6ec57 100644 --- a/doc/sphinx-guides/source/installation/prerequisites.rst +++ b/doc/sphinx-guides/source/installation/prerequisites.rst @@ -2,13 +2,20 @@ Prerequisites ============= -Before running the Dataverse installation script, you must install and configure the following software, preferably on a Linux distribution such as RHEL or CentOS. After following all the steps below (which are mostly based on CentOS 6), you can proceed to the :doc:`installation-main` section. +Before running the Dataverse installation script, you must install and configure the following software. + +After following all the steps below, you can proceed to the :doc:`installation-main` section. You **may** find it helpful to look at how the configuration is done automatically by various tools such as Vagrant, Puppet, or Ansible. See the :doc:`prep` section for pointers on diving into these scripts. .. contents:: |toctitle| :local: +Linux +----- + +We assume you plan to run Dataverse on Linux and we recommend RHEL/CentOS, which is the Linux distribution tested by the Dataverse development team. 
Please be aware that while el7 (RHEL/CentOS 7) is the recommended platform, the steps below were originally written for el6 and may need to be updated (please feel free to make a pull request!). + Java ---- @@ -21,13 +28,13 @@ Dataverse should run fine with only the Java Runtime Environment (JRE) installed The Oracle JDK can be downloaded from http://www.oracle.com/technetwork/java/javase/downloads/index.html -On a Red Hat and similar Linux distributions, install OpenJDK with something like:: +On RHEL/CentOS, install OpenJDK (devel version) using yum:: # yum install java-1.8.0-openjdk-devel If you have multiple versions of Java installed, Java 8 should be the default when ``java`` is invoked from the command line. You can test this by running ``java -version``. -On Red Hat/CentOS you can make Java 8 the default with the ``alternatives`` command, having it prompt you to select the version of Java from a list:: +On RHEL/CentOS you can make Java 8 the default with the ``alternatives`` command, having it prompt you to select the version of Java from a list:: # alternatives --config java @@ -38,7 +45,7 @@ If you don't want to be prompted, here is an example of the non-interactive invo Glassfish --------- -Glassfish Version 4.1 is required. There are known issues with Glassfish 4.1.1 as chronicled in https://github.com/IQSS/dataverse/issues/2628 so it should be avoided until that issue is resolved. +Glassfish Version 4.1 is required. There are known issues with newer versions of the Glassfish 4.x series so they should be avoided. For details, see https://github.com/IQSS/dataverse/issues/2628 . The issue we are using to track support for Glassfish 5 is https://github.com/IQSS/dataverse/issues/4248 . Installing Glassfish ==================== @@ -73,7 +80,7 @@ Once Glassfish is installed, you'll need a newer version of the Weld library (v2 # vim /usr/local/glassfish4/glassfish/domains/domain1/config/domain.xml -This recommendation comes from http://blog.c2b2.co.uk/2013/07/glassfish-4-performance-tuning.html among other places. +This recommendation comes from http://www.c2b2.co.uk/middleware-blog/glassfish-4-performance-tuning-monitoring-and-troubleshooting.php among other places. - Start Glassfish and verify the Weld version:: @@ -86,7 +93,7 @@ Launching Glassfish on system boot The Dataverse installation script will start Glassfish if necessary, but you may find the following scripts helpful to launch Glassfish start automatically on boot. - This :download:`Systemd file<../_static/installation/files/etc/systemd/glassfish.service>` may be serve as a reference for systems using Systemd (such as RHEL/CentOS 7 or Ubuntu 16+) -- This :download:`init script<../_static/installation/files/etc/init.d/glassfish.init.service>` may be useful for RHEL/CentOS6 or Ubuntu >= 14 if you're using a Glassfish service account, or +- This :download:`init script<../_static/installation/files/etc/init.d/glassfish.init.service>` may be useful for RHEL/CentOS 6 or Ubuntu >= 14 if you're using a Glassfish service account, or - This :download:`Glassfish init script <../_static/installation/files/etc/init.d/glassfish.init.root>` may be helpful if you're just going to run Glassfish as root. It is not necessary for Glassfish to be running before you execute the Dataverse installation script; it will start Glassfish for you. @@ -101,18 +108,16 @@ Installing PostgreSQL Version 9.x is required. Previous versions have not been tested.
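Before following the installation steps below, it may be worth confirming which PostgreSQL version your distribution's repositories will actually give you. A quick check along these lines (a sketch assuming ``yum`` is in use; package names can differ if you have added the PGDG repositories)::

    # Version of the postgresql-server package offered by the enabled repos
    yum info postgresql-server | grep -i '^Version'

    # If PostgreSQL is already installed, confirm the client/server version
    psql --version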
-The version that ships with RHEL 6 and above is fine:: +The version that ships with el7 and above is fine:: # yum install postgresql-server # service postgresql initdb # service postgresql start -The standard init script that ships RHEL 6 and similar should work fine. Enable it with this command:: +The standard init script that ships with el7 should work fine. Enable it with this command:: # chkconfig postgresql on - - Configuring Database Access for the Dataverse Application (and the Dataverse Installer) ======================================================================================= @@ -181,9 +186,12 @@ With the Dataverse-specific schema in place, you can now start Solr:: Solr Init Script ================ -The command above will start Solr in the foreground which is good for a quick sanity check that Solr accepted the schema file, but starting Solr with an init script is recommended. You can attempt to adjust this :download:`Solr init script <../_static/installation/files/etc/init.d/solr>` for your needs or write your own. +The command above will start Solr in the foreground which is good for a quick sanity check that Solr accepted the schema file, but letting the system start Solr automatically is recommended. + +- This :download:`Solr Systemd file<../_static/installation/files/etc/systemd/solr.service>` will launch Solr on boot as the solr user for RHEL/CentOS 7 or Ubuntu 16+ systems, or +- For systems using init.d, you may attempt to adjust this :download:`Solr init script <../_static/installation/files/etc/init.d/solr>` for your needs or write your own. -Solr should be running before the installation script is executed. +Solr should be running before the Dataverse installation script is executed. Securing Solr ============= diff --git a/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst b/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst index 4e394461e26..d0ab1544aa8 100644 --- a/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst +++ b/doc/sphinx-guides/source/installation/r-rapache-tworavens.rst @@ -136,11 +136,15 @@ Can be installed with :fixedwidthplain:`yum`:: yum install R R-devel -EPEL distribution recommended; version 3.3.2 is **strongly** recommended. +EPEL distribution recommended; version 3.3.2 is **strongly** recommended. Note that R 3.3.2 comes from EPEL on el6 but R 3.4.2 comes from EPEL on el7. If :fixedwidthplain:`yum` isn't configured to use EPEL repositories ( https://fedoraproject.org/wiki/EPEL ): -CentOS users can install the RPM :fixedwidthplain:`epel-release`. For RHEL/CentOS 6:: +RHEL/CentOS users can install the RPM :fixedwidthplain:`epel-release`. For RHEL/CentOS 7:: + + yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm + +RHEL/CentOS users can install the RPM :fixedwidthplain:`epel-release`. For RHEL/CentOS 6:: yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm @@ -310,25 +314,19 @@ Compare the two files. **It is important that the two copies are identical**. *(Yes, this is a HACK! We are working on finding a better way to ensure this compatibility between TwoRavens and Dataverse!)* -e. Enable TwoRavens' Explore Button in Dataverse ------------------------------------------------- +e. Enable TwoRavens Button in Dataverse +--------------------------------------- -Now that you have installed TwoRavens, the following must be done in order to -integrate it with your Dataverse. 
+Now that you have installed TwoRavens, you can make it available to your users by adding it as an "external tool" for your Dataverse installation. (For more on external tools in general, see the :doc:`external-tools` section.) -First, enable the Data Explore option:: +First, download :download:`twoRavens.json <../_static/installation/files/root/external-tools/twoRavens.json>` as a starting point and edit ``toolUrl`` in that external tool manifest file to be the URL where you want TwoRavens to run. This is the URL reported by the installer script (as in the example at the end of step ``c.``, above). - curl -X PUT -d true http://localhost:8080/api/admin/settings/:TwoRavensTabularView - -Once enabled, the 'Explore' button will appear next to ingested tabular data files; clicking it will redirect -the user to the instance of TwoRavens, initialized with the data variables from the selected file. - -Then, the TwoRavens URL must be configured in the settings of your Dataverse application - so that it knows where to redirect the user. -This can be done by issuing the following API call:: +Once you have made your edits, make the tool available within Dataverse with the following curl command (assuming ``twoRavens.json`` is in your current working directory): - curl -X PUT -d {TWORAVENS_URL} http://localhost:8080/api/admin/settings/:TwoRavensUrl +``curl -X POST -H 'Content-type: application/json' --upload-file twoRavens.json http://localhost:8080/api/admin/externalTools`` -where :fixedwidthplain:`{TWORAVENS_URL}` is the URL reported by the installer script (as in the example at the end of step ``c.``, above). +Once enabled, an "Explore" dropdown with a "TwoRavens" button will appear next to ingested tabular data files; clicking it will redirect +the user to the instance of TwoRavens, initialized with the data variables from the selected file. f. Perform a quick test of TwoRavens functionality -------------------------------------------------- diff --git a/doc/sphinx-guides/source/installation/shibboleth.rst b/doc/sphinx-guides/source/installation/shibboleth.rst index a883cef2826..0b14c93a97c 100644 --- a/doc/sphinx-guides/source/installation/shibboleth.rst +++ b/doc/sphinx-guides/source/installation/shibboleth.rst @@ -23,7 +23,7 @@ System Requirements Support for Shibboleth in Dataverse is built on the popular `"mod_shib" Apache module, "shibd" daemon `_, and the `Embedded Discovery Service (EDS) `_ Javascript library, all of which are distributed by the `Shibboleth Consortium `_. EDS is bundled with Dataverse, but ``mod_shib`` and ``shibd`` must be installed and configured per below. -Only Red Hat Enterprise Linux (RHEL) 6 and derivatives such as CentOS have been tested (x86_64 versions) by the Dataverse team. Newer versions of RHEL and CentOS **should** work but you'll need to adjust the yum repo config accordingly. See https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPLinuxInstall for details and note that (according to that page) as of this writing Ubuntu and Debian are not offically supported by the Shibboleth project. +Only Red Hat Enterprise Linux (RHEL) and derivatives such as CentOS have been tested (x86_64 versions) by the Dataverse team. See https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPLinuxInstall for details and note that (according to that page) as of this writing Ubuntu and Debian are not officially supported by the Shibboleth project.
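Before working through the Apache and Shibboleth packages below, it can be useful to confirm that the machine matches the tested platform described above (a RHEL/CentOS derivative on x86_64). A quick sanity check, offered only as a sketch::

    # Distribution and release (present on RHEL/CentOS and derivatives)
    cat /etc/redhat-release

    # CPU architecture; the tested platform is x86_64
    uname -m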
Install Apache ~~~~~~~~~~~~~~ @@ -46,6 +46,12 @@ This yum repo is recommended at https://wiki.shibboleth.net/confluence/display/S ``cd /etc/yum.repos.d`` +If you are running el7 (RHEL/CentOS 7): + +``wget http://download.opensuse.org/repositories/security:/shibboleth/CentOS_7/security:shibboleth.repo`` + +If you are running el6 (RHEL/CentOS 6): + ``wget http://download.opensuse.org/repositories/security:/shibboleth/CentOS_CentOS-6/security:shibboleth.repo`` Install Shibboleth Via Yum @@ -135,6 +141,12 @@ You can download a :download:`sample ssl.conf file <../_static/installation/file Note that ``/etc/httpd/conf.d/shib.conf`` and ``/etc/httpd/conf.d/shibboleth-ds.conf`` are expected to be present from installing Shibboleth via yum. +You may wish to also add a timeout directive to the ProxyPass line within ssl.conf. This is especially useful for larger file uploads as apache may prematurely kill the connection before the upload is processed. + +e.g. ``ProxyPass / ajp://localhost:8009/ timeout=600`` defines a timeout of 600 seconds. + +Try to strike a balance with the timeout setting. Again a timeout too low will impact file uploads. A timeout too high may cause additional stress on the server as it will have to service idle clients for a longer period of time. + Configure Shibboleth -------------------- diff --git a/doc/sphinx-guides/source/user/data-exploration/index.rst b/doc/sphinx-guides/source/user/data-exploration/index.rst index 708f774bb46..e6af35387b9 100755 --- a/doc/sphinx-guides/source/user/data-exploration/index.rst +++ b/doc/sphinx-guides/source/user/data-exploration/index.rst @@ -6,6 +6,8 @@ Data Exploration Guide ======================================================= +Note that the installation of Dataverse you are using may have additional or different tools configured. Developers interested in creating tools should refer to the :doc:`/installation/external-tools` section of the Installation Guide. + Contents: .. toctree:: diff --git a/doc/sphinx-guides/source/user/data-exploration/worldmap.rst b/doc/sphinx-guides/source/user/data-exploration/worldmap.rst index fe6be851be8..fe2a6e25785 100644 --- a/doc/sphinx-guides/source/user/data-exploration/worldmap.rst +++ b/doc/sphinx-guides/source/user/data-exploration/worldmap.rst @@ -9,7 +9,7 @@ WorldMap: Geospatial Data Exploration Dataverse and WorldMap ====================== -`WorldMap `_ is developed by the Center for Geographic Analysis (CGA) at Harvard and is open source software that helps researchers visualize and explore their data in maps. The WorldMap and Dataverse collaboration allows researchers to upload shapefiles or tabular files to Dataverse for long term storage and receive a persistent identifier (through DOI), then easily navigate into WorldMap to interact with the data and save to WorldMap as well. +`WorldMap `_ is developed by the Center for Geographic Analysis (CGA) at Harvard and is open source software that helps researchers visualize and explore their data in maps. The WorldMap and Dataverse collaboration allows researchers to upload shapefiles or tabular files to Dataverse for long term storage and receive a persistent identifier (through DOI), then easily navigate into WorldMap to interact with the data. Note: WorldMap hosts their own `user guide `_ that covers some of the same material as this page. 
@@ -33,11 +33,15 @@ Once you have uploaded your .zip shapefile, a Map Data button will appear next t To get started with visualizing your shapefile, click on the blue "Visualize on WorldMap" button in Geoconnect. It may take up to 45 seconds for the data to be sent to WorldMap and then back to Geoconnect. -Once this process has finished, you will be taken to a new page where you can style your map through Attribute, Classification Method, Number of Intervals, and Colors. Clicking "View on WorldMap" will open WorldMap in a new tab, allowing you to see how your map will be displayed there. +Once this process has finished, you will be taken to a new page where you can style your map through Attribute, Classification Method, Number of Intervals, and Colors. Clicking "Apply Changes" will send your map to both Dataverse and WorldMap, creating a preview of your map that will be visible on your file page and your dataset page. -After styling your map, you can either save it by clicking "Return to Dataverse" or delete it with the "Delete" button. If you decide to delete the map, it will no longer appear on WorldMap. Returning to Dataverse will send the styled map layer to both Dataverse and WorldMap. A preview of your map will now be visible on your file page and your dataset page. +Clicking "View on WorldMap" will open WorldMap in a new tab, allowing you to see how your map will be displayed there. -To replace your shapefile's map with a new one, simply click the Map Data button again. +You can delete your map with the "Delete" button. If you decide to delete the map, it will no longer appear on WorldMap, and your dataset in Dataverse will no longer display the map preview. + +When you're satisfied with your map, you may click "Return to the Dataverse" to go back to Dataverse. + +In the future, to replace your shapefile's map with a new one, simply click the Map Data button on the dataset or file page to return to the Geoconnect edit map page. Mapping tabular files with Geoconnect ===================================== @@ -121,9 +125,9 @@ Now that you have created your map: - Dataverse will contain a preview of the map and links to the larger version on WorldMap. -The map editor (pictured above) provides a set of options you can use to style your map. The "Return to the Dataverse" button saves your map and brings you back to Dataverse. "View on WorldMap" takes you to the map's page on WorldMap, which offers additional views and options. +The map editor (pictured above) provides a set of options you can use to style your map. Clicking "Apply Changes" saves the current version of your map to Dataverse and Worldmap. The "Return to the Dataverse" button brings you back to Dataverse. "View on WorldMap" takes you to the map's page on WorldMap, which offers additional views and options. -If you'd like to make future changes to your map, you can return to the editor by clicking the "Map Data" button on your file. +If you'd like to make further changes to your map in the future, you can return to the editor by clicking the "Map Data" button on your file. 
Removing your map ================= diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst index cf6465e3e55..8c13abaf7b4 100755 --- a/doc/sphinx-guides/source/user/dataset-management.rst +++ b/doc/sphinx-guides/source/user/dataset-management.rst @@ -60,7 +60,7 @@ The file types listed in the following sections are supported by additional func Tabular Data Files ------------------ -Files in certain formats - Stata, SPSS, R, Excel(xlsx) and CSV - may be ingested as tabular data (see "Tabular Data Ingest" section for details). Tabular data files can be further explored and manipulated with `TwoRavens <../user/data-exploration/tworavens.html>`_ - a statistical data exploration application integrated with Dataverse. It allows the user to run statistical models, view summary statistics, download subsets of variable vectors and more. To start, click on the "Explore" button, found next to each relevant tabular file (the application will be opened in a new window). To download subsets of variables click on the "Download" button found next to a relevant tabular file and select "Data Subset" in the dropdown menu. You will then be able to create your subset using the interface opened in a new window (this functionality is also provided by the `TwoRavens <../user/data-exploration/tworavens.html>`_ project). See the `TwoRavens documentation section <../user/data-exploration/tworavens.html>`_ for more information. +Files in certain formats - Stata, SPSS, R, Excel(xlsx) and CSV - may be ingested as tabular data (see "Tabular Data Ingest" section for details). Tabular data files can be further explored and manipulated with `TwoRavens <../user/data-exploration/tworavens.html>`_ - a statistical data exploration application integrated with Dataverse, as well as other :doc:`/installation/external-tools` if they have been enabled in the installation of Dataverse you are using. TwoRavens allows the user to run statistical models, view summary statistics, download subsets of variable vectors and more. To start, click on the "Explore" button, found next to each relevant tabular file (the application will be opened in a new window). To download subsets of variables click on the "Download" button found next to a relevant tabular file and select "Data Subset" in the dropdown menu. You will then be able to create your subset using the interface opened in a new window (this functionality is also provided by the `TwoRavens <../user/data-exploration/tworavens.html>`_ project). See the `TwoRavens documentation section <../user/data-exploration/tworavens.html>`_ for more information. For example, for the ingest functionality for tabular files in Harvard Dataverse, a file can only be up to 2GB in size. To use the ingest functionality for RData files, a file can only be up to 1MB in size. However, to upload a RData file without using ingest, a file can be up to 2GB in size. @@ -225,14 +225,14 @@ your own custom Terms of Use for your Datasets. Custom Terms of Use for Datasets -------------------------------- -If you are unable to use a CC0 waiver for your datasets you are able to set your own custom terms of use. To do so, select "No, do not apply CC0 - "Public Domain Dedication" and a Terms of Use textbox will show up allowing you to enter your own custom terms of use for your dataset. To add more information about the Terms of Use, click on "Additional Information \[+]". 
+If you are unable to use a CC0 waiver for your datasets you are able to set your own custom terms of use. To do so, select "No, do not apply CC0 - "Public Domain Dedication" and a Terms of Use textbox will show up allowing you to enter your own custom terms of use for your dataset. To add more information about the Terms of Use, we have provided fields like Special Permissions, Restrictions, Citation Requirements, etc. Here is an `example of a Data Usage Agreement `_ for datasets that have de-identified human subject data. Restricted Files + Terms of Access ---------------------------------- -If you restrict any files in your dataset, you will be prompted by a pop-up to enter Terms of Access for the data. This can also be edited in the Terms tab or selecting Terms in the "Edit" dropdown button in the dataset. You may also allow users to request access for your restricted files by enabling "Request Access". To add more information about the Terms of Access, click on "Additional Information \[+]". +If you restrict any files in your dataset, you will be prompted by a pop-up to enter Terms of Access for the data. This can also be edited in the Terms tab or selecting Terms in the "Edit" dropdown button in the dataset. You may also allow users to request access for your restricted files by enabling "Request Access". To add more information about the Terms of Access, we have provided fields like Data Access Place, Availability Status, Contact for Access, etc. **Note:** Some Dataverse installations do not allow for file restriction. @@ -246,6 +246,10 @@ This is where you will enable a particular Guestbook for your dataset, which is Roles & Permissions ===================== +Dataverse user accounts can be granted roles that define which actions they are allowed to take on specific dataverses, datasets, and/or files. Each role comes with a set of permissions, which define the specific actions that users may take. + +Roles and permissions may also be granted to groups. Groups can be defined as a collection of Dataverse user accounts, a collection of IP addresses (e.g. all users of a library's computers), or a collection of all users who log in using a particular institutional login (e.g. everyone who logs in with a particular university's account credentials). + Dataset-Level ------------- @@ -253,20 +257,20 @@ Admins or curators of a dataset can assign roles and permissions to the users of When you access a dataset's permissions page, you will see two sections: -**Users/Groups:** Here you can assign roles to specific users or groups of users, determining which actions they are permitted to take on your dataset. You can also reference a list of all users who have roles assigned to them for your dataset and remove their roles if you please. Some of the users listed may have roles assigned at the dataverse level, in which case those roles can only be removed from the dataverse permissions page. +**Users/Groups:** Here you can assign roles to specific users or groups, determining which actions they are permitted to take on your dataset. You can also reference a list of all users who have roles assigned to them for your dataset and remove their roles if you please. Some of the users listed may have roles assigned at the dataverse level, in which case those roles can only be removed from the dataverse permissions page. **Roles:** Here you can reference a full list of roles that can be assigned to users of your dataset. Each role lists the permissions that it offers. 
File-Level ---------- -If you have restricted access to specific files in your dataset, you can grant specific users or groups access to those files while still keeping them restricted to the general public. If you are an admin or curator of a dataset, then you can get to the file-level permissions page by clicking the "Edit" button, highlighting "Permissions" from the dropdown list, and clicking "File". +If specific files in your dataset are restricted access, then you can grant specific users or groups access to those files while still keeping them restricted to the general public. If you are an admin or curator of a dataset, then you can get to the file-level permissions page by clicking the "Edit" button, highlighting "Permissions" from the dropdown list, and clicking "File". When you access a dataset's file-level permissions page, you will see two sections: **Users/Groups:** Here you can see which users or groups have been granted access to which files. You can click the "Grant Access to Users/Groups" button to see a box where you can grant access to specific files within your dataset to specific users or groups. If any users have requested access to a file in your dataset, you can grant or reject their access request here. -**Restricted Files:** In this section, you can see the same information, but broken down by each individual file in your dataset. For each file, you can click the "Assign Access" button to see a box where you can grant access to that file to specific users. +**Restricted Files:** In this section, you can see the same information, but broken down by each individual file in your dataset. For each file, you can click the "Assign Access" button to see a box where you can grant access to that file to specific users or groups. .. _thumbnails-widgets: @@ -360,8 +364,6 @@ Once you edit your published dataset a new draft version of this dataset will be **Important Note:** If you add a file, your dataset will automatically be bumped up to a major version (e.g., if you were at 1.0 you will go to 2.0). -.. To get to the already published version 1 of your dataset, click on the "View Dataset Versions" button on the top left section of your dataset. To go back to the unpublished version click on the same button. - On the Versions tab of a dataset page, there is a versions table that displays the version history of the dataset. You can use the version number links in this table to navigate between the different versions of the dataset, including the unpublished draft version, if you have permission to access it. There is also a Versions tab on the file page. The versions table for a file displays the same information as the dataset, but the summaries are filtered down to only show the actions related to that file. If a new dataset version were created without any changes to an individual file, that file's version summary for that dataset version would read "No changes associated with this version". @@ -386,7 +388,7 @@ You must also include a reason as to why this dataset was deaccessioned. Select Add more information as to why this was deaccessioned in the free-text box. If the dataset has moved to a different repository or site you are encouraged to include a URL (preferably persistent) for users to continue to be able to access this dataset in the future. -If you deaccession the most recently published version of the dataset but not all versions of the dataset, you are able to go in and create a new draft for the dataset. 
For example, you have a version 1 and version 2 of a dataset, both published, and deaccession version 2. You are then able to edit version 1 of the dataset and a new draft vresion will be created. +If you deaccession the most recently published version of the dataset but not all versions of the dataset, you are able to go in and create a new draft for the dataset. For example, you have a version 1 and version 2 of a dataset, both published, and deaccession version 2. You are then able to edit version 1 of the dataset and a new draft version will be created. **Important Note**: A tombstone landing page with the basic citation metadata will always be accessible to the public if they use the persistent URL (Handle or DOI) provided in the citation for that dataset. Users will not be able to see any of the files or additional metadata that were previously available prior to deaccession. diff --git a/doc/sphinx-guides/source/user/dataverse-management.rst b/doc/sphinx-guides/source/user/dataverse-management.rst index be12c3b32aa..779590faf01 100755 --- a/doc/sphinx-guides/source/user/dataverse-management.rst +++ b/doc/sphinx-guides/source/user/dataverse-management.rst @@ -61,7 +61,7 @@ Tip: The metadata fields you select as required will appear on the Create Datase Theme ==================================================== -The Theme feature provides you with a way to customize the look of your dataverse. You can decide either to use the customization from the dataverse above yours or upload your own image file. Supported image types are JPEG, TIFF, or PNG and should be no larger than 500 KB. The maximum display size for an image file in a dataverse's theme is 940 pixels wide by 120 pixels high. Additionally, you can select the colors for the header of your dataverse and the text that appears in your dataverse. You can also add a link to your personal website, the website for your organization or institution, your department, journal, etc. +The Theme feature provides you with a way to customize the look of your dataverse. You can decide either to use the theme from the dataverse containing your dataverse (even up to the root dataverse, AKA the homepage), or upload your own image file. Supported image types are JPEG, TIFF, or PNG and should be no larger than 500 KB. The maximum display size for an image file in a dataverse's theme is 940 pixels wide by 120 pixels high. Additionally, you can select the colors for the header of your dataverse and the text that appears in your dataverse. You can also add a link to your personal website, the website for your organization or institution, your department, journal, etc. .. _dataverse-widgets: @@ -92,6 +92,10 @@ Adding Widgets to an OpenScholar Website Roles & Permissions ======================================================= +Dataverse user accounts can be granted roles that define which actions they are allowed to take on specific dataverses, datasets, and/or files. Each role comes with a set of permissions, which define the specific actions that users may take. + +Roles and permissions may also be granted to groups. Groups can be defined as a collection of Dataverse user accounts, a collection of IP addresses (e.g. all users of a library's computers), or a collection of all users who log in using a particular institutional login (e.g. everyone who logs in with a particular university's account credentials). + Admins of a dataverse can assign roles and permissions to the users of that dataverse. 
If you are an admin on a dataverse, then you will find the link to the Permissions page under the Edit dropdown on the dataverse page. |image2| @@ -104,7 +108,7 @@ When you access a dataverse's permissions page, you will see three sections: **Permissions:** Here you can decide the requirements that determine which types of users can add datasets and sub dataverses to your dataverse, and what permissions they'll be granted when they do so. -**Users/Groups:** Here you can assign roles to specific users or groups of users, determining which actions they are permitted to take on your dataverse. You can also reference a list of all users who have roles assigned to them for your dataverse and remove their roles if you please. +**Users/Groups:** Here you can assign roles to specific users or groups, determining which actions they are permitted to take on your dataverse. You can also reference a list of all users who have roles assigned to them for your dataverse and remove their roles if you please. **Roles:** Here you can reference a full list of roles that can be assigned to users of your dataverse. Each role lists the permissions that it offers. diff --git a/doc/sphinx-guides/source/user/find-use-data.rst b/doc/sphinx-guides/source/user/find-use-data.rst index 73b151ec357..05f1ce49c6f 100755 --- a/doc/sphinx-guides/source/user/find-use-data.rst +++ b/doc/sphinx-guides/source/user/find-use-data.rst @@ -72,7 +72,7 @@ Download Files Within the Files tab on a dataset page, you can download the files in that dataset. To download more than one file at a time, select the files you would like to download and then click the Download button above the files. The selected files will download in zip format. -Tabular data files offer additional options: You can explore using the TwoRavens data visualization tool by clicking the Explore button, or choose from a number of tabular-data-specific download options available as a dropdown under the Download button. +Tabular data files offer additional options: You can explore using the TwoRavens data visualization tool (or other :doc:`/installation/external-tools` if they have been enabled) by clicking the Explore button, or choose from a number of tabular-data-specific download options available as a dropdown under the Download button. .. _rsync_download: diff --git a/doc/sphinx-guides/source/user/tabulardataingest/csv.rst b/doc/sphinx-guides/source/user/tabulardataingest/csv.rst index c29708f31c9..08f67ab8b50 100644 --- a/doc/sphinx-guides/source/user/tabulardataingest/csv.rst +++ b/doc/sphinx-guides/source/user/tabulardataingest/csv.rst @@ -9,7 +9,7 @@ Ingest of Comma-Separated Values files as tabular data. Dataverse will make an attempt to turn CSV files uploaded by the user into tabular data, using the `Apache CSV parser `_. -Main formatting requirements: +Main formatting requirements ----------------------------- The first row in the document will be treated as the CSV's header, containing variable names for each column. @@ -38,12 +38,12 @@ are recognized as a **single** row with **5** comma-separated values (cells): (where ``\n`` is a new line character) -Limitations: +Limitations ------------ Compared to other formats, relatively little information about the data ("variable-level metadata") can be extracted from a CSV file. Aside from the variable names supplied in the top line, the ingest will make an educated guess about the data type of each comma-separated column. 
One of the supported rich file formats (Stata, SPSS and R) should be used if you need to provide more descriptive variable-level metadata (variable labels, categorical values and labels, explicitly defined data types, etc.). -Recognized data types and formatting: +Recognized data types and formatting ------------------------------------- The application will attempt to recognize numeric, string, and date/time values in the individual comma-separated columns. @@ -79,7 +79,7 @@ Any non-Latin characters are allowed in character string values, **as long as th The most immediate implication is in the calculation of the UNF signatures for the data vectors, as different normalization rules are applied to numeric, character, and date/time values. (see the :doc:`/developers/unf/index` section for more information). If it is important to you that the UNF checksums of your data are accurately calculated, check that the numeric and date/time columns in your file were recognized as such (as ``type=numeric`` and ``type=character, category=date(time)``, respectively). If, for example, a column that was supposed to be numeric is recognized as a vector of character values (strings), double-check that the formatting of the values is consistent. Remember, a single improperly-formatted value in the column will turn it into a vector of character strings, and result in a different UNF. Fix any formatting errors you find, delete the file from the dataset, and try to ingest it again. -Tab-delimited Data Files: +Tab-delimited Data Files ------------------------- Presently, tab-delimited files can be ingested by replacing the TABs with commas. diff --git a/doc/sphinx-guides/source/workflows.rst b/doc/sphinx-guides/source/workflows.rst deleted file mode 100644 index 477530ef7d1..00000000000 --- a/doc/sphinx-guides/source/workflows.rst +++ /dev/null @@ -1,92 +0,0 @@ -Workflows -========== - -Dataverse can perform two sequences of actions when datasets are published: one prior to publishing (marked by a ``PrePublishDataset`` trigger), and one after the publication has succeeded (``PostPublishDataset``). The pre-publish workflow is useful for having an external system prepare a dataset for being publicly accessed (a possibly lengthy activity that requires moving files around, uploading videos to a streaming server, etc.), or to start an approval process. A post-publish workflow might be used for sending notifications about the newly published dataset. - -Workflow steps are created using *step providers*. Dataverse ships with an internal step provider that offers some basic functionality, and with the ability to load 3rd party step providers. This allows installations to implement functionality they need without changing the Dataverse source code. - -Steps can be internal (say, writing some data to the log) or external. External steps involve Dataverse sending a request to an external system, and waiting for the system to reply. The wait period is arbitrary, and so allows the external system unbounded operation time. This is useful, e.g., for steps that require human intervension, such as manual approval of a dataset publication. - -The external system reports the step result back to dataverse, by sending a HTTP ``POST`` command to ``api/workflows/{invocation-id}``. The body of the request is passed to the paused step for further processing. - -If a step in a workflow fails, Dataverse make an effort to roll back all the steps that preceeded it. Some actions, such as writing to the log, cannot be rolled back. 
If such an action has a public external effect (e.g. send an EMail to a mailing list) it is advisable to put it in the post-release workflow. - -.. tip:: - For invoking external systems using a REST api, Dataverse's internal step - provider offers a step for sending and receiving customizable HTTP requests. - It's called *http/sr*, and is detailed below. - -Administration --------------- - -A Dataverse instance stores a set of workflows in its database. Workflows can be managed using the ``api/admin/workflows/`` endpoints of the :doc:`api/native-api`. Sample workflow files are available in ``scripts/api/data/workflows``. - -At the moment, defining a workflow for each trigger is done for the entire instance, using the endpoint ``api/admin/workflows/default/«trigger type»``. - -In order to prevent unauthorized resuming of workflows, Dataverse maintains a "white list" of IP addresses from which resume requests are honored. This list is maintained using the ``/api/admin/workflows/ip-whitelist`` endpoint of the :doc:`api/native-api`. By default, Dataverse honors resume requests from localhost only (``127.0.0.1;::1``), so set-ups that use a single server work with no additional configuration. - - -Available Steps ---------------- - -Dataverse has an internal step provider, whose id is ``:internal``. It offers the following steps: - -log -~~~~~~~~ -A step that writes data about the current workflow invocation to the instance log. It also writes the messages in its ``parameters`` map. - -.. code:: json - - { - "provider":":internal", - "stepType":"log", - "parameters": { - "aMessage": "message content", - "anotherMessage": "message content, too" - } - } - - -pause -~~~~~~~~ -A step that pauses the workflow. The workflow is paused until a POST request is sent to ``/api/workflows/{invocation-id}``. - -.. code:: json - - { - "provider":":internal", - "stepType":"pause" - } - - -http/sr -~~~~~~~~~ -A step that sends a HTTP request to an external system, and then waits for a response. The response has to match a regular expression specified in the step parameters. The url, content type, and message body can use data from the workflow context, using a simple markup language. This step has specific parameters for rollback. - -.. 
code:: json - - { - "provider":":internal", - "stepType":"http/sr", - "parameters": { - "url":"http://localhost:5050/dump/${invocationId}", - "method":"POST", - "contentType":"text/plain", - "body":"START RELEASE ${dataset.id} as ${dataset.displayName}", - "expectedResponse":"OK.*", - "rollbackUrl":"http://localhost:5050/dump/${invocationId}", - "rollbackMethod":"DELETE ${dataset.id}" - } - } - -Available variables are: - -* ``invocationId`` -* ``dataset.id`` -* ``dataset.identifier`` -* ``dataset.globalId`` -* ``dataset.displayName`` -* ``dataset.citation`` -* ``minorVersion`` -* ``majorVersion`` -* ``releaseStatus`` diff --git a/local_lib/com/lyncode/xoai/4.1.0-header-patch/xoai-4.1.0-header-patch.pom b/local_lib/com/lyncode/xoai/4.1.0-header-patch/xoai-4.1.0-header-patch.pom index 9e0d802244c..89a14d88c51 100644 --- a/local_lib/com/lyncode/xoai/4.1.0-header-patch/xoai-4.1.0-header-patch.pom +++ b/local_lib/com/lyncode/xoai/4.1.0-header-patch/xoai-4.1.0-header-patch.pom @@ -199,7 +199,7 @@ xalan xalan - 2.7.0 + 2.7.2 dom4j diff --git a/pom.xml b/pom.xml index 5575054b2ff..e1b33695aff 100644 --- a/pom.xml +++ b/pom.xml @@ -4,7 +4,7 @@ edu.harvard.iq dataverse - 4.8.1 + 4.8.5 war dataverse @@ -125,7 +125,7 @@ commons-fileupload commons-fileupload - 1.3.1 + 1.3.3 com.google.code.gson @@ -304,12 +304,12 @@ org.ocpsoft.rewrite rewrite-servlet - 2.0.12.Final + 3.4.2.Final org.ocpsoft.rewrite rewrite-config-prettyfaces - 2.0.12.Final + 3.4.2.Final edu.ucsb.nceas @@ -320,7 +320,7 @@ org.jsoup jsoup - 1.8.1 + 1.8.3 com.jayway.restassured diff --git a/scripts/api/data/authentication-providers/echo.json b/scripts/api/data/authentication-providers/base-oauth.json similarity index 100% rename from scripts/api/data/authentication-providers/echo.json rename to scripts/api/data/authentication-providers/base-oauth.json diff --git a/scripts/api/data/authentication-providers/base-oauth2.json b/scripts/api/data/authentication-providers/base-oauth2.json deleted file mode 100644 index 177fd12a023..00000000000 --- a/scripts/api/data/authentication-providers/base-oauth2.json +++ /dev/null @@ -1,8 +0,0 @@ -{ - "id":"echo-dignified", - "factoryAlias":"Echo", - "title":"Dignified Echo provider", - "subtitle":"Approves everyone, based on their credentials, and adds some flair", - "factoryData":"Sir,Esq.", - "enabled":true -} diff --git a/scripts/api/data/authentication-providers/orcid-sandbox.json b/scripts/api/data/authentication-providers/orcid-sandbox.json index 7f017add1e6..3a1c311fff4 100644 --- a/scripts/api/data/authentication-providers/orcid-sandbox.json +++ b/scripts/api/data/authentication-providers/orcid-sandbox.json @@ -1,8 +1,8 @@ { - "id":"orcid-sandbox", + "id":"orcid-v2-sandbox", "factoryAlias":"oauth2", "title":"ORCID Sandbox", - "subtitle":"ORCiD - sandbox", - "factoryData":"type: orcid | userEndpoint: https://api.sandbox.orcid.org/v1.2/{ORCID}/orcid-profile | clientId: APP-HIV99BRM37FSWPH6 | clientSecret: ee844b70-f223-4f15-9b6f-4991bf8ed7f0", + "subtitle":"ORCiD - sandbox (v2)", + "factoryData":"type: orcid | userEndpoint: https://api.sandbox.orcid.org/v2.0/{ORCID}/person | clientId: APP-HIV99BRM37FSWPH6 | clientSecret: ee844b70-f223-4f15-9b6f-4991bf8ed7f0", "enabled":true } diff --git a/scripts/api/setup-optional-harvard.sh b/scripts/api/setup-optional-harvard.sh index 1763fc33adf..92055bd8ee9 100755 --- a/scripts/api/setup-optional-harvard.sh +++ b/scripts/api/setup-optional-harvard.sh @@ -31,12 +31,12 @@ curl -X PUT -d true "$SERVER/admin/settings/:ScrubMigrationData" echo "- Enabling 
Shibboleth" curl -X POST -H "Content-type: application/json" http://localhost:8080/api/admin/authenticationProviders --upload-file ../../doc/sphinx-guides/source/_static/installation/files/etc/shibboleth/shibAuthProvider.json echo "- Enabling TwoRavens" -curl -s -X PUT -d true "$SERVER/admin/settings/:TwoRavensTabularView" +curl -X POST -H 'Content-type: application/json' --upload-file ../../doc/sphinx-guides/source/_static/installation/files/root/external-tools/twoRavens.json http://localhost:8080/api/admin/externalTools echo "- Enabling Geoconnect" curl -s -X PUT -d true "$SERVER/admin/settings/:GeoconnectCreateEditMaps" curl -s -X PUT -d true "$SERVER/admin/settings/:GeoconnectViewMaps" echo "- Setting system email" -curl -X PUT -d "Harvard Dataverse Support " http://localhost:8080/api/admin/settings/:SystemEmail +curl -X PUT -d "Harvard Dataverse Support " http://localhost:8080/api/admin/settings/:SystemEmail curl -X PUT -d ", The President & Fellows of Harvard College" http://localhost:8080/api/admin/settings/:FooterCopyright echo "- Setting up the Harvard Shibboleth institutional group" curl -s -X POST -H 'Content-type:application/json' --upload-file data/shibGroupHarvard.json "$SERVER/admin/groups/shib?key=$adminKey" diff --git a/scripts/backup/run_backup/README_HOWTO.txt b/scripts/backup/run_backup/README_HOWTO.txt new file mode 100644 index 00000000000..2e2a0a84e89 --- /dev/null +++ b/scripts/backup/run_backup/README_HOWTO.txt @@ -0,0 +1,205 @@ +Introduction +============ + +The script, run_backup.py is run on schedule (by a crontab, most +likely). It will back up the files stored in your Dataverse on a +remote storage system. + +As currently implemented, the script can read Dataverse files stored +either on the filesystem or S3; and back them up on a remote storage +server via ssh/scp. It can be easily expanded to support other storage +and backup types (more information is provided below). + +Requirements +============ + +The backup script is written in Python. It was tested with Python v. 2.6 and 2.7. +The following extra modules are required: + +psycopg2 [2.7.3.2] - PostgreSQL driver +boto3 [1.4.7] - AWS sdk, for accessing S3 storage +paramiko [2.2.1] - SSH client, for transferring files via SFTP + +(see below for the exact versions tested) + +Also, an incomplete implementation for backing up files on a remote +swift node is provided. To fully add swift support (left as an +exercise for the reader) an additional module, swiftclient will be +needed. + +Test platforms: + +MacOS 10.12 +----------- + +Python: 2.7.2 - part of standard distribution +paramiko: 2.2.1 - standard +psycopg2: 2.7.3.2 - built with "pip install psycopg2" +boto3: 1.4.7 - built with "pip install boto3" + +CentOS 6 +-------- + +Python: 2.6.6 (from the base distribution for CentOS 6; default /usr/bin/python) +paramiko: 1.7.5 (base distribution) + +distributed as an rpm, python-paramiko.noarch, via the yum repo "base". +if not installed: + yum install python-paramiko + +psycopg2: 2.0.14 (base distribution) +distributed as an rpm, python-psycopg2.x86_64, via the yum repo "base". +if not installed: + yum install python-psycopg2 + +boto3: 1.4.8 (built with "pip install boto3") + +- quick and easy build; +make sure you have pip installed. ("yum install python-pip", if not) + +NOTE: v. 
2.6 of Python is considered obsolete; the only reason we are
+using it is that it is the default version that ships with the equally
+obsolete CentOS 6, which just happened to be what we
+had available to test this setup on. Similarly, the versions of
+paramiko and psycopg2, above, are quite old too. But everything
+appears to be working.
+
+CentOS 7:
+---------
+
+(TODO)
+
+
+Usage
+=====
+
+In the default mode, the script will attempt to retrieve and back up
+only the files that have been created in the Dataverse since the
+createdate timestamp on the most recent file already in the backup
+database; or all the files, if this is the first run (see the section
+below on what the "backup database" is).
+
+When run with the "--rerun" option (python run_backup.py --rerun) the
+script will retrieve the list of ALL the files currently in the
+Dataverse, but will only attempt to back up the ones not yet backed up
+successfully (i.e., it will skip the files already in the backup
+database with the 'OK' backup status).
+
+
+Configuration
+=============
+
+Access credentials for the Dataverse
+and the remote storage system are configured in the file config.ini.
+
+The following config.ini sections must be configured for the
+whole thing to work:
+
+1. Database.
+
+The script needs to be able to access the Dataverse database, in order to
+obtain the lists of files that have changed since the last backup and
+need to be copied. The script can use PostgreSQL running on a
+remote server. Just make sure that the remote server is configured to
+allow connections from the host running the backup script, and that
+PostgreSQL allows database access from this host too.
+
+Configure the access credentials as in the example below:
+
+[Database]
+Host: localhost
+Port: 5432
+Database: dvndb
+Username: dvnapp
+Password: xxxxx
+
+In addition to the main Dataverse database, the script maintains its
+own database for keeping track of the backup status of individual
+files. The name of the database is specified in the following setting:
+
+BackupDatabase: backupdb
+
+The database must be created prior to running the script. For
+example, on the command line:
+  createdb -U postgres backupdb --owner=dvnapp
+
+NOTE that the current assumption is that this Postgres database lives
+on the same server as the main Dataverse database and is owned by the
+same user.
+
+Also, one table must be created *in this database* (NOT in the main
+Dataverse database) before the script can be run. The script
+backupdb.sql is provided in this directory. NOTE that the Postgres
+user name dvnapp is hard-coded in the script; change it to reflect the
+name of the database user on your system, if necessary.
+
+You can use the standard psql command to create the table; for example:
+
+  psql -d backupdb -f backupdb.sql
+
+(please note that the example above assumes "backupdb" as the name of
+the backup database)
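+
+As a quick sanity check of the settings above, a few lines of Python
+(using the same ConfigParser and psycopg2 modules the backup scripts
+rely on) should be able to connect to the backup database and count
+the rows in the datafilestatus table. This is purely illustrative and
+is not part of the backup module itself:
+
+  import ConfigParser
+  import psycopg2
+
+  config = ConfigParser.ConfigParser()
+  config.read("config.ini")
+
+  host = config.get("Database", "Host")
+  port = config.get("Database", "Port")
+  user = config.get("Database", "Username")
+  password = config.get("Database", "Password")
+  backupdb = config.get("Database", "BackupDatabase")
+
+  # connect to the backup status database created above
+  conn = psycopg2.connect(host=host, port=port, dbname=backupdb,
+                          user=user, password=password)
+  cursor = conn.cursor()
+  cursor.execute("SELECT count(*) FROM datafilestatus")
+  print "datafilestatus rows: %d" % cursor.fetchone()[0]
+  conn.close()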
+
+2. Repository
+
+This section provides configuration for accessing (reading) the files stored
+in this Dataverse. In its present form, the script can read files stored on
+the filesystem and S3. There is no support for reading files stored
+via swift yet. Adding swift support should be straightforward,
+by supplying another storage module, similar to the existing
+storage_filesystem.py and storage_s3.py. If you'd like to work on
+this, please get in touch.
+
+For the filesystem storage: the assumption is that the script has
+direct access to the filesystem where the files live. This means that in
+order for the script to work on a server that's different from the one
+running the Dataverse application, the filesystem must be readable by
+that server via NFS, or similarly shared with it.
+
+The filesystem access requires a single configuration setting, as in
+the example below:
+
+[Repository]
+FileSystemDirectory: /usr/local/glassfish4/glassfish/domains/domain1/files
+
+For S3, no configuration is needed in config.ini. But AWS
+access must be properly configured for the user running the backup
+module, in the standard ~/.aws location.
+
+
+3. Backup section.
+
+This section specifies the method for storing the files on the remote
+("secondary") storage subsystem:
+
+[Backup]
+StorageType: ssh
+
+The currently supported methods are "ssh" (the files are transferred
+to the remote location via SSH/SFTP) and "swift" (an untested and
+possibly incomplete implementation is provided; see
+README_IMPLEMENTATION.txt for more details).
+
+For ssh access, the following configuration entries are needed:
+
+SshHost: yyy.zzz.edu
+SshPort: 22
+SshUsername: xxxxx
+
+Additionally, SSH access to the remote server (SshHost, above) must be
+provided for the user specified (SshUsername) via ssh keys.
+
+4. Email notifications
+
+Once the script completes a backup run it will send a (very minimal)
+status report to the email address specified in the config.ini file;
+for example:
+
+[Notifications]
+Email: xxx@yyy.zzz.edu
+
+As currently implemented, the report will only specify how many files
+have been processed, and how many succeeded or failed. In order to get
+more detailed information about the individual files you'll need to
+consult the datafilestatus table in the backup database.
+
diff --git a/scripts/backup/run_backup/README_IMPLEMENTATION.txt b/scripts/backup/run_backup/README_IMPLEMENTATION.txt
new file mode 100644
index 00000000000..7e78e0eb57b
--- /dev/null
+++ b/scripts/backup/run_backup/README_IMPLEMENTATION.txt
@@ -0,0 +1,102 @@
+The backup script is implemented in Python (developed and tested with
+v. 2.7.10). The following extra modules are needed:
+
+(versions tested as of the writing of this doc, 11.14.2017)
+
+psycopg2 [2.7.3.2] - PostgreSQL driver
+boto3 [1.4.7] - AWS SDK, for accessing S3 storage
+paramiko [2.2.1] - SSH client, for transferring files via SFTP
+swiftclient [2.7.0] - for reading [not yet implemented] and writing [incomplete implementation provided] swift objects.
+
+1. Database access.
+
+The module uses psycopg2 to access the Dataverse database, to obtain
+the lists of files that have changed since the last backup and need
+to be copied over. Additionally, it maintains its own database for
+keeping track of the backup status of individual files. As of now,
+this extra database must reside on the same server as the main
+Dataverse database and be owned by the same Postgres user.
+
+Consult README_HOWTO.txt on how to set up this backup database (this
+needs to be done prior to running the backup script).
+
+2. Storage access
+
+The currently implemented storage access methods, for the local filesystem
+and S3, are isolated in the files storage_filesystem.py and storage_s3.py,
+respectively. To add support for swift, a similar fragment of code will
+need to be provided, with an open_storage_object... method that can go
+to the configured swift end node and return the byte stream associated
+with the datafile. Use storage_filesystem.py as the model. Then the
+top-level storage.py module will need to be modified to import and use
+the extra storage method.
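+
+A hypothetical storage_swift.py could look roughly like the sketch
+below. This is only an illustration: the function name follows the
+open_storage_object... pattern, it reuses the swift credentials from
+the [Backup] section of config.ini (the same ones backup_swift.py
+uses), and it simply assumes a "container:object" encoding of the
+object location, by analogy with storage_s3.py. The naming scheme an
+actual Dataverse swift store uses may well be different.
+
+import io
+import swiftclient
+from config import (ConfigSectionMap)
+
+def open_storage_object_swift(dataset_authority, dataset_identifier, object_location, is_tabular_data):
+    auth_url = ConfigSectionMap("Backup")['swiftauthurl']
+    auth_version = ConfigSectionMap("Backup")['swiftauthversion']
+    user = ConfigSectionMap("Backup")['swiftuser']
+    tenant = ConfigSectionMap("Backup")['swifttenant']
+    key = ConfigSectionMap("Backup")['swiftkey']
+
+    conn = swiftclient.Connection(authurl=auth_url, user=user, key=key,
+                                  tenant_name=tenant, auth_version=auth_version)
+
+    # placeholder layout, modeled on storage_s3.py:
+    container_name, object_name = object_location.split(":", 1)
+    if (is_tabular_data is not None):
+        # for tabular files, back up the original, not the ingested version
+        object_name += ".orig"
+
+    # get_object() returns a (headers, content) tuple; wrap the content
+    # in a file-like byte stream, as the other storage_... modules do
+    headers, content = conn.get_object(container_name, object_name)
+    return io.BytesIO(content)
+
+storage.py would then import open_storage_object_swift and return its
+result from the existing 'swift' branch, instead of raising the
+"not supported yet" ValueError.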
+
+3. Backup (write) access.
+
+Similarly, storage type-specific code for writing the backed-up objects is
+isolated in the backup_...py files. The currently implemented storage
+methods are ssh/sftp (backup_ssh.py, the default) and swift
+(backup_swift.py; experimental, untested). To add support for other
+storage systems, use backup_ssh.py as the model to create your own
+backup_... modules, implementing similar methods that (a) copy the
+byte stream associated with a Dataverse datafile onto this storage
+system and (b) verify the copy against the checksum (MD5 or SHA1)
+provided by the Dataverse. In the SSH/SFTP implementation, we can do
+the verification step by simply executing md5sum/sha1sum on the remote
+server via ssh, once the file is copied. With swift, the only way to
+verify against the checksum is to read the file *back* from the swift
+end node, and calculate the checksum on the obtained stream.
+
+4. Keeping track of the backup status
+
+The module uses the table datafilestatus in the "backup database" to
+maintain the backup status information for the individual
+datafiles. For successfully backed-up files, the 'OK' status is
+stored. If the module fails to read the file from the Dataverse
+storage, the status 'FAIL_READ' is stored; if it fails to copy the file
+over, the status 'FAIL_WRITE' is stored; and if the copy appears to
+succeed but cannot be verified against the checksum, 'FAIL_VERIFY'
+is stored. The Dataverse "createdate" timestamp of the Datafile is
+also stored in the database; this way, for incremental backups, the
+script tries to retrieve only the Datafiles created after the latest
+createdate timestamp currently in the backup db.
+
+5. TODOs
+
+
+As currently implemented, the status notification report will only
+specify how many files have been processed, and how many succeeded or
+failed. In order to get more detailed information about the individual
+files you'll need to consult the datafilestatus table in the backup
+database.
+
+It could be useful to extend it to provide a list of the specific
+files that were backed up successfully or failed.
+
+Note that the script relies on the *nix 'mail' command to send the
+email notification. I chose to do it this way because it felt easier
+than requiring the user to configure which SMTP server to use in
+order to send it from Python code. But this requires the mail
+command to be present, and the system to be configured to send
+email from the command line.
+
+If for whatever reason this is not an option, and mail needs to be
+sent via remote SMTP, the provided email_notification.py could easily
+be modified to use something like:
+
+
+import smtplib
+from email.mime.text import MIMEText
+
+...
+
+msg = MIMEText(text)
+
+msg['Subject'] = subject_str
+msg['To'] = ConfigSectionMap("Notifications")['email']
+
+...
+ +s = smtplib.SMTP(ConfigSectionMap("Notifications")['smtpserver']) +s.sendmail(from, ConfigSectionMap("Notifications")['email'], msg.as_string()) +s.quit() + diff --git a/scripts/backup/run_backup/backup.py b/scripts/backup/run_backup/backup.py new file mode 100644 index 00000000000..6004f212c67 --- /dev/null +++ b/scripts/backup/run_backup/backup.py @@ -0,0 +1,17 @@ +import io +import re +#import backup_swift #TODO +from backup_ssh import (backup_file_ssh) +from config import (ConfigSectionMap) + +def backup_file (file_input, dataset_authority, dataset_identifier, storage_identifier, checksum_type, checksum_value, file_size): + storage_type = ConfigSectionMap("Backup")['storagetype'] + + if storage_type == 'swift': + #backup_file_swift(file_input, dataset_authority, dataset_identifier, storage_identifier, checksum_type, checksum_value, file_size) + raise NotImplementedError('no backup_swift yet') + elif storage_type == 'ssh': + backup_file_ssh(file_input, dataset_authority, dataset_identifier, storage_identifier, checksum_type, checksum_value, file_size) + else: + raise ValueError("only ssh/sftp and swift are supported as backup storage media") + diff --git a/scripts/backup/run_backup/backup_ssh.py b/scripts/backup/run_backup/backup_ssh.py new file mode 100644 index 00000000000..3355b9cffb2 --- /dev/null +++ b/scripts/backup/run_backup/backup_ssh.py @@ -0,0 +1,149 @@ +# Dataverse backup, ssh io module + +import sys +import io +import paramiko +import os +import re +from config import (ConfigSectionMap) + +my_ssh_client = None + +def open_ssh_client(): + ssh_host = ConfigSectionMap("Backup")['sshhost'] + ssh_port = ConfigSectionMap("Backup")['sshport'] + ssh_username = ConfigSectionMap("Backup")['sshusername'] + + print "SSH Host: %s" % (ssh_host) + print "SSH Port: %s" % (ssh_port) + print "SSH Username: %s" % (ssh_username) + + + ssh_client=paramiko.SSHClient() + ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + ssh_client.connect(hostname=ssh_host,username=ssh_username) + + print "Connected!" + + return ssh_client + +# Transfers the file "local_flo" over ssh/sftp to the configured remote server. +# local_flo can be either a string specifying the file path, or a file-like object (stream). +# Note that if a stream is supplied, the method also needs the file size to be specified, +# via the parameter byte_size. 
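+# On the remote server the file is stored under
+#   <BackupDirectory>/<dataset_authority>/<dataset_identifier>/
+# and named after the storage identifier (with any "<tag>://" prefix stripped);
+# missing intermediate directories are created one level at a time.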
+def transfer_file(local_flo, dataset_authority, dataset_identifier, storage_identifier, byte_size): + sftp_client=my_ssh_client.open_sftp() + + remote_dir = dataset_authority + "/" + dataset_identifier + + subdirs = remote_dir.split("/") + + cdir = ConfigSectionMap("Backup")['backupdirectory'] + "/" + for subdir in subdirs: + try: + cdir = cdir + subdir + "/" + sftpattr=sftp_client.stat(cdir) + except IOError: + #print "directory "+cdir+" does not exist (creating)" + sftp_client.mkdir(cdir) + #else: + # print "directory "+cdir+" already exists" + + m = re.search('^([a-z0-9]*)://(.*)$', storage_identifier) + if m is not None: + storageTag = m.group(1) + storage_identifier = re.sub('^.*:', '', storage_identifier) + + remote_file = cdir + storage_identifier + + if (type(local_flo) is str): + sftp_client.put(local_flo,remote_file) + else: + # assume it's a stream: + # sftp_client.putfo() is convenient, but appears to be unavailable in older + # versions of paramiko; so we'll be using .read() and .write() instead: + #sftp_client.putfo(local_flo,remote_file,byte_size) + sftp_stream = sftp_client.open(remote_file,"wb") + while True: + buffer = local_flo.read(32*1024) + if len(buffer) == 0: + break; + sftp_stream.write (buffer) + sftp_stream.close() + + sftp_client.close() + + print "File transfered." + + return remote_file + +def verify_remote_file(remote_file, checksum_type, checksum_value): + try: + stdin,stdout,stderr=my_ssh_client.exec_command("ls "+remote_file) + remote_file_checked = stdout.readlines()[0].rstrip("\n\r") + except: + raise ValueError("remote file check failed (" + remote_file + ")") + + if (remote_file != remote_file_checked): + raise ValueError("remote file NOT FOUND! (" + remote_file_checked + ")") + + if (checksum_type == "MD5"): + remote_command = "md5sum" + elif (checksum_type == "SHA1"): + remote_command = "sha1sum" + + try: + stdin,stdout,stderr=my_ssh_client.exec_command(remote_command+" "+remote_file) + remote_checksum_value = (stdout.readlines()[0]).split(" ")[0] + except: + raise ValueError("remote checksum check failed (" + remote_file + ")") + + if (checksum_value != remote_checksum_value): + raise ValueError("remote checksum BAD! 
(" + remote_checksum_value + ")") + + +def backup_file_ssh(file_input, dataset_authority, dataset_identifier, storage_identifier, checksum_type, checksum_value, byte_size=0): + global my_ssh_client + if (my_ssh_client is None): + my_ssh_client = open_ssh_client() + print "ssh client is not defined" + else: + print "reusing the existing ssh client" + + try: + file_transfered = transfer_file(file_input, dataset_authority, dataset_identifier, storage_identifier, byte_size) + except: + raise ValueError("failed to transfer file") + + verify_remote_file(file_transfered, checksum_type, checksum_value) + +def main(): + + print "entering ssh (standalone mode)" + + + print "testing local file:" + try: + file_path="config.ini" + backup_file_ssh("config.ini", "1902.1", "XYZ", "config.ini", "MD5", "8e6995806b1cf27df47c5900869fdd27") + except ValueError: + print "failed to verify file (\"config.ini\")" + else: + print "file ok" + + print "testing file stream:" + try: + file_size = os.stat(file_path).st_size + print ("file size: %d" % file_size) + file_stream = io.open("config.ini", "rb") + backup_file_ssh(file_stream, "1902.1", "XYZ", "config.ini", "MD5", "8e6995806b1cf27df47c5900869fdd27", file_size) + except ValueError: + print "failed to verify file (\"config.ini\")" + else: + print "file ok" + + +if __name__ == "__main__": + main() + + diff --git a/scripts/backup/run_backup/backup_swift.py b/scripts/backup/run_backup/backup_swift.py new file mode 100644 index 00000000000..463c8de0f3b --- /dev/null +++ b/scripts/backup/run_backup/backup_swift.py @@ -0,0 +1,25 @@ +import io +import re +import swiftclient +from config import (ConfigSectionMap) + +def backup_file_swift (file_input, dataset_authority, dataset_identifier, storage_identifier): + auth_url = ConfigSectionMap("Backup")['swiftauthurl'] + auth_version = ConfigSectionMap("Backup")['swiftauthversion'] + user = ConfigSectionMap("Backup")['swiftuser'] + tenant = ConfigSectionMap("Backup")['swifttenant'] + key = ConfigSectionMap("Backup")['swiftkey'] + + conn = swiftclient.Connection( + authurl=auth_url, + user=user, + key=key, + tenant_name=tenant, + auth_version=auth_version + ) + + container_name = dataset_authority + ":" + dataset_identifier + conn.put(container_name) + + conn.put_object(container_name, storage_identifier, file_input) + diff --git a/scripts/backup/run_backup/backupdb.sql b/scripts/backup/run_backup/backupdb.sql new file mode 100644 index 00000000000..85acb2f1d8e --- /dev/null +++ b/scripts/backup/run_backup/backupdb.sql @@ -0,0 +1,31 @@ +CREATE TABLE datafilestatus ( + id integer NOT NULL, + datasetidentifier character varying(255), + storageidentifier character varying(255), + status character varying(255), + createdate timestamp without time zone, + lastbackuptime timestamp without time zone, + lastbackupmethod character varying(16) +); + +ALTER TABLE datafilestatus OWNER TO dvnapp; + +CREATE SEQUENCE datafilestatus_id_seq + START WITH 1 + INCREMENT BY 1 + NO MINVALUE + NO MAXVALUE + CACHE 1; + + +ALTER TABLE datafilestatus_id_seq OWNER TO dvnapp; + +ALTER SEQUENCE datafilestatus_id_seq OWNED BY datafilestatus.id; + +ALTER TABLE ONLY datafilestatus + ADD CONSTRAINT datafilestatus_pkey PRIMARY KEY (id); + +ALTER TABLE ONLY datafilestatus ALTER COLUMN id SET DEFAULT nextval('datafilestatus_id_seq'::regclass); + +ALTER TABLE ONLY datafilestatus + ADD CONSTRAINT datafilestatus_storageidentifier_key UNIQUE (storageidentifier); \ No newline at end of file diff --git a/scripts/backup/run_backup/config.ini 
b/scripts/backup/run_backup/config.ini new file mode 100644 index 00000000000..b6bc7a89a37 --- /dev/null +++ b/scripts/backup/run_backup/config.ini @@ -0,0 +1,66 @@ +[Database] +; Dataverse database access configuration +; Note that this section is REQUIRED! - +; you must be able to access the database in order to run the backup module. +; The database can run on a remote server; but make sure you configure the +; host and access creds (below) correctly, and make sure Postgres is accepting +; connections from this server address. + +Host: localhost +Port: 5432 +Database: dvndb +Username: dvnapp +Password: xxxxxx +BackupDatabase: backupdb + +[Repository] +; This section provides configuration for accessing (reading) the files stored +; in this Dataverse. Note that the files can be physicall stored on different +; physical media; if you have files in your Dataverse stored via different +; supported storage drivers - filesystem, swift, S3 - as long as access is properly +; configured here, this script should be able to back them up. + +; configuration for files stored on the filesystem +; (the filesystem needs to be accessible by the system running the backup module) + +FileSystemDirectory: /usr/local/glassfish4/glassfish/domains/domain1/files + +; no configuration needed here for reading files stored on AWS/S3 +; (but the S3 authentication credentials need to be provided in the +; standard ~/.aws location) + +; configuration for files stored on openstack/swift: +; swift NOT SUPPORTED yet + +[Backup] +; ssh configuration: +; (i.e., backup to remote storage accessible via ssh/sftp; default) + +StorageType: ssh +SshHost: backup.dataverse.edu +; ssh port is optional, defaults to 22 +SshPort: 22 +SshUsername: backup +; (the remote server must have ssh key access configured for the user +; specified above) +; the directory on the remote server where the files will be copied to: +BackupDirectory: /dataverse_backup + +; Swift configuration: + +;StorageType: swift +SwiftAuthUrl: https://something.dataverse.edu/swift/v2.0/tokens +SwiftAuthVersion: 2 +SwiftUser: xxx +SwiftKey: yyy +; Note that the 'tenant' setting is only needed for Auth v.1 and 2. +SwiftTenant: zzz +SwiftEndPoint: https://something.dataverse.edu/swift/v1 + +; S3 configuration: +; Dataverse files will be backed up onto AWS/S3, in the bucket specified. +; S3 authentication credentials are stored in the +; standard ~/.aws location + +[Notifications] +Email: somebody@dataverse.edu diff --git a/scripts/backup/run_backup/config.py b/scripts/backup/run_backup/config.py new file mode 100644 index 00000000000..8faaa4fa34a --- /dev/null +++ b/scripts/backup/run_backup/config.py @@ -0,0 +1,17 @@ +import ConfigParser +import sys +Config = ConfigParser.ConfigParser() +Config.read("config.ini") + +def ConfigSectionMap(section): + dict1 = {} + options = Config.options(section) + for option in options: + try: + dict1[option] = Config.get(section, option) + if dict1[option] == -1: + sys.stderr.write("skip: %s\n" % option) + except: + print("exception on %s!" 
% option) + dict1[option] = None + return dict1 diff --git a/scripts/backup/run_backup/database.py b/scripts/backup/run_backup/database.py new file mode 100644 index 00000000000..9c08038092f --- /dev/null +++ b/scripts/backup/run_backup/database.py @@ -0,0 +1,138 @@ +import psycopg2 +import sys +import pprint +from time import (time) +from datetime import (datetime, timedelta) +from config import (ConfigSectionMap) + +dataverse_db_connection=None +backup_db_connection=None + +def create_database_connection(database='database'): + Host = ConfigSectionMap("Database")['host'] + Port = ConfigSectionMap("Database")['port'] + Database = ConfigSectionMap("Database")[database] + Username = ConfigSectionMap("Database")['username'] + Password = ConfigSectionMap("Database")['password'] + + #print "Database Host: %s" % (Host) + #print "Database Port: %s" % (Port) + #print "Database Name: %s" % (Database) + #print "Username: %s" % (Username) + #print "Password: %s" % (Password) + + #Define our connection string + conn_string = "host='"+Host+"' dbname='"+Database+"' user='"+Username+"' password='"+Password+"'" + + #print "Connecting to database\n->%s" % (conn_string) + + # get a connection, if a connect cannot be made an exception will be raised here + conn = psycopg2.connect(conn_string) + + #print "Connected!\n" + + return conn + +def get_backupdb_connection(): + global backup_db_connection + + if backup_db_connection is None: + backup_db_connection = create_database_connection('backupdatabase') + + return backup_db_connection + +def query_database(sinceTimestamp=None): + global dataverse_db_connection + + dataverse_db_connection = create_database_connection() + + cursor = dataverse_db_connection.cursor() + + # Select data files from the database + # The query below is a bit monstrous, as we try to get all the information about the stored file + # from multiple tables in the single request. Note the "LEFT JOIN" in it - we want it to return + # the "datatable" object referencing this datafile, if such exists, or NULL otherwise. If the + # value is not NULL, we know this is a tabular data file. + dataverse_query="SELECT s.authority, s.identifier, o.storageidentifier, f.checksumtype, f.checksumvalue, f.filesize,o.createdate, datatable.id FROM datafile f LEFT JOIN datatable ON f.id = datatable.datafile_id, dataset s, dvobject o WHERE o.id = f.id AND o.owner_id = s.id AND s.harvestingclient_id IS null" + if sinceTimestamp is None: + cursor.execute(dataverse_query) + else: + dataverse_query = dataverse_query+" AND o.createdate > %s" + cursor.execute(dataverse_query, (sinceTimestamp,)) + + + records = cursor.fetchall() + + return records + +def get_last_timestamp(): + backup_db_connection = get_backupdb_connection() + + cursor = backup_db_connection.cursor() + + # select the last timestamp from the datafilestatus table: + dataverse_query="SELECT createdate FROM datafilestatus ORDER BY createdate DESC LIMIT 1" + + cursor.execute(dataverse_query) + + record = cursor.fetchone() + + if record is None: + #print "table is empty" + return None + + #timestamp = record[0] + timedelta(seconds=1) + timestamp = record[0] + # milliseconds are important! 
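+    # (query_database() filters on "o.createdate > <this timestamp>", so keeping
+    # the fractional seconds prevents the most recently backed-up file from being
+    # selected again on the next incremental run)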
+ timestamp_str = timestamp.strftime('%Y-%m-%d %H:%M:%S.%f') + + return timestamp_str + +def get_datafile_status(dataset_authority, dataset_identifier, storage_identifier): + backup_db_connection = get_backupdb_connection() + cursor = backup_db_connection.cursor() + + # select the last timestamp from the datafilestatus table: + + dataverse_query="SELECT status FROM datafilestatus WHERE datasetidentifier=%s AND storageidentifier=%s;" + + dataset_id=dataset_authority+"/"+dataset_identifier + + cursor.execute(dataverse_query, (dataset_id, storage_identifier)) + + record = cursor.fetchone() + + if record is None: + #print "no backup status for this file" + return None + + backupstatus = record[0] + #print "last backup status: "+backupstatus + return backupstatus + +def record_datafile_status(dataset_authority, dataset_identifier, storage_identifier, status, createdate): + current_status = get_datafile_status(dataset_authority, dataset_identifier, storage_identifier) + + backup_db_connection = get_backupdb_connection() + cursor = backup_db_connection.cursor() + + createdate_str = createdate.strftime('%Y-%m-%d %H:%M:%S.%f') + nowdate_str = datetime.fromtimestamp(time()).strftime('%Y-%m-%d %H:%M:%S') + + if current_status is None: + query = "INSERT INTO datafilestatus (status, createdate, lastbackuptime, lastbackupmethod, datasetidentifier, storageidentifier) VALUES (%s, %s, %s, %s, %s, %s);" + else: + query = "UPDATE datafilestatus SET status=%s, createdate=%s, lastbackuptime=%s, lastbackupmethod=%s WHERE datasetidentifier=%s AND storageidentifier=%s;" + + dataset_id=dataset_authority+"/"+dataset_identifier + backup_method = ConfigSectionMap("Backup")['storagetype'] + + cursor.execute(query, (status, createdate_str, nowdate_str, backup_method, dataset_id, storage_identifier)) + + # finalize transaction: + backup_db_connection.commit() + cursor.close() + + + + diff --git a/scripts/backup/run_backup/email_notification.py b/scripts/backup/run_backup/email_notification.py new file mode 100644 index 00000000000..ed3504b8682 --- /dev/null +++ b/scripts/backup/run_backup/email_notification.py @@ -0,0 +1,25 @@ +from config import (ConfigSectionMap) +from subprocess import Popen, PIPE, STDOUT +from time import (time) +from datetime import (datetime) + +def send_notification(text): + try: + notification_address = ConfigSectionMap("Notifications")['email'] + except: + notification_address = None + + if (notification_address is None): + raise ValueError('Notification email address is not configured') + + nowdate_str = datetime.fromtimestamp(time()).strftime('%Y-%m-%d %H:%M') + subject_str = ('Dataverse datafile backup report [%s]' % nowdate_str) + + p = Popen(['mail','-s',subject_str,notification_address], stdout=PIPE, stdin=PIPE, stderr=PIPE) + stdout_data = p.communicate(input=text)[0] + +def main(): + send_notification('backup report: test, please disregard') + +if __name__ == "__main__": + main() diff --git a/scripts/backup/run_backup/requirements.txt b/scripts/backup/run_backup/requirements.txt new file mode 100644 index 00000000000..5696d138e91 --- /dev/null +++ b/scripts/backup/run_backup/requirements.txt @@ -0,0 +1,6 @@ +# python2 requirements + +psycopg2 +boto3 +paramiko +# TODO: where to get `swiftclient` from diff --git a/scripts/backup/run_backup/run_backup.py b/scripts/backup/run_backup/run_backup.py new file mode 100644 index 00000000000..7124d21ae2e --- /dev/null +++ b/scripts/backup/run_backup/run_backup.py @@ -0,0 +1,99 @@ +#!/usr/bin/env python + +import ConfigParser +import psycopg2 
+import sys +import io +import re +from database import (query_database, get_last_timestamp, record_datafile_status, get_datafile_status) +from storage import (open_dataverse_file) +from backup import (backup_file) +from email_notification import (send_notification) + +def main(): + rrmode = False + + if (len(sys.argv) > 1 and sys.argv[1] == '--rerun'): + rrmode = True + + if rrmode: + time_stamp = None + else: + time_stamp = get_last_timestamp() + + if time_stamp is None: + print "No time stamp! first run (or a full re-run)." + records = query_database() + else: + print "last backup: "+time_stamp + records = query_database(time_stamp) + + files_total=0 + files_success=0 + files_failed=0 + files_skipped=0 + + for result in records: + dataset_authority = result[0] + dataset_identifier = result[1] + storage_identifier = result[2] + checksum_type = result[3] + checksum_value = result[4] + file_size = result[5] + create_time = result[6] + is_tabular_data = result[7] + + if (checksum_value is None): + checksum_value = "MISSING" + + + if (storage_identifier is not None and dataset_identifier is not None and dataset_authority is not None): + files_total += 1 + print dataset_authority + "/" + dataset_identifier + "/" + storage_identifier + ", " + checksum_type + ": " + checksum_value + + file_input=None + + # if this is a re-run, we are only re-trying the files that have failed previously: + if (rrmode and get_datafile_status(dataset_authority, dataset_identifier, storage_identifier) == 'OK'): + files_skipped += 1 + continue + + try: + file_input = open_dataverse_file(dataset_authority, dataset_identifier, storage_identifier, is_tabular_data) + except: + print "failed to open file "+storage_identifier + file_input=None + + + if (file_input is not None): + try: + backup_file(file_input, dataset_authority, dataset_identifier, storage_identifier, checksum_type, checksum_value, file_size) + print "backed up file "+storage_identifier + record_datafile_status(dataset_authority, dataset_identifier, storage_identifier, 'OK', create_time) + files_success += 1 + except ValueError, ve: + exception_message = str(ve) + print "failed to back up file "+storage_identifier+": "+exception_message + if (re.match("^remote", exception_message) is not None): + record_datafile_status(dataset_authority, dataset_identifier, storage_identifier, 'FAIL_VERIFY', create_time) + else: + record_datafile_status(dataset_authority, dataset_identifier, storage_identifier, 'FAIL_WRITE', create_time) + files_failed += 1 + #TODO: add a separate failure status 'FAIL_VERIFY' - for when it looked like we were able to copy the file + # onto the remote storage system, but the checksum verification failed (?) 
+ else: + record_datafile_status(dataset_authority, dataset_identifier, storage_identifier, 'FAIL_READ', create_time) + files_failed += 1 + + if (files_skipped > 0): + report = ('backup script run report: %d files processed; %d skipped (already backed up), %d success, %d failed' % (files_total, files_skipped, files_success, files_failed)) + else: + report = ('backup script run report: %d files processed; %d success, %d failed' % (files_total, files_success, files_failed)) + print report + send_notification(report) + +if __name__ == "__main__": + main() + + + diff --git a/scripts/backup/run_backup/storage.py b/scripts/backup/run_backup/storage.py new file mode 100644 index 00000000000..b831e7e003e --- /dev/null +++ b/scripts/backup/run_backup/storage.py @@ -0,0 +1,28 @@ +import io +import re +import boto3 +from config import (ConfigSectionMap) +from storage_filesystem import (open_storage_object_filesystem) +from storage_s3 import (open_storage_object_s3) + + +def open_dataverse_file(dataset_authority, dataset_identifier, storage_identifier, is_tabular_data): + m = re.search('^([a-z0-9]*)://(.*)$', storage_identifier) + if m is None: + # no storage identifier tag. (defaulting to filesystem storage) + storageTag = 'file' + objectLocation = storage_identifier; + else: + storageTag = m.group(1) + objectLocation = m.group(2) + + if storageTag == 'file': + byteStream = open_storage_object_filesystem(dataset_authority, dataset_identifier, objectLocation, is_tabular_data) + return byteStream + elif storageTag == 's3': + byteStream = open_storage_object_s3(dataset_authority, dataset_identifier, objectLocation, is_tabular_data) + return byteStream + elif storageTag == 'swift': + raise ValueError("backup of swift objects not supported yet") + + raise ValueError("Unknown or unsupported storage method: "+storage_identifier) diff --git a/scripts/backup/run_backup/storage_filesystem.py b/scripts/backup/run_backup/storage_filesystem.py new file mode 100644 index 00000000000..f5cff99a91b --- /dev/null +++ b/scripts/backup/run_backup/storage_filesystem.py @@ -0,0 +1,11 @@ +import io +import re +from config import (ConfigSectionMap) + +def open_storage_object_filesystem(dataset_authority, dataset_identifier, object_location, is_tabular_data): + filesystem_directory = ConfigSectionMap("Repository")['filesystemdirectory'] + if (is_tabular_data is not None): + object_location += ".orig" + file_path = filesystem_directory+"/"+dataset_authority+"/"+dataset_identifier+"/"+object_location + byte_stream = io.open(file_path, "rb") + return byte_stream diff --git a/scripts/backup/run_backup/storage_s3.py b/scripts/backup/run_backup/storage_s3.py new file mode 100644 index 00000000000..94858ee21b7 --- /dev/null +++ b/scripts/backup/run_backup/storage_s3.py @@ -0,0 +1,13 @@ +import io +import re +import boto3 + +def open_storage_object_s3(dataset_authority, dataset_identifier, object_location, is_tabular_data): + s3 = boto3.resource('s3') + bucket_name,object_name = object_location.split(":",1) + key = dataset_authority + "/" + dataset_identifier + "/" + object_name; + if (is_tabular_data is not None): + key += ".orig" + s3_obj = s3.Object(bucket_name=bucket_name, key=key) + # "Body" is a byte stream associated with the object: + return s3_obj.get()['Body'] diff --git a/scripts/database/upgrades/upgrade_v4.8.3_to_v4.8.4.sql b/scripts/database/upgrades/upgrade_v4.8.3_to_v4.8.4.sql new file mode 100644 index 00000000000..670a2d191db --- /dev/null +++ b/scripts/database/upgrades/upgrade_v4.8.3_to_v4.8.4.sql @@ -0,0 +1,2 @@ 
+-- Google login has used 131 characters. 64 is not enough. +ALTER TABLE oauth2tokendata ALTER COLUMN accesstoken TYPE text; diff --git a/scripts/database/upgrades/upgrade_v4.8.5_to_v4.9.sql b/scripts/database/upgrades/upgrade_v4.8.5_to_v4.9.sql new file mode 100644 index 00000000000..748b516e030 --- /dev/null +++ b/scripts/database/upgrades/upgrade_v4.8.5_to_v4.9.sql @@ -0,0 +1,4 @@ +ALTER TABLE externaltool ADD COLUMN type character varying(255); +ALTER TABLE externaltool ALTER COLUMN type SET NOT NULL; +-- Previously, the only explore tool was TwoRavens. We now persist the name of the tool. +UPDATE guestbookresponse SET downloadtype = 'TwoRavens' WHERE downloadtype = 'Explore'; diff --git a/scripts/installer/Makefile b/scripts/installer/Makefile index df29cda93a5..046e6cb73cd 100644 --- a/scripts/installer/Makefile +++ b/scripts/installer/Makefile @@ -45,10 +45,10 @@ ${GLASSFISH_SETUP_SCRIPT}: glassfish-setup.sh /bin/cp glassfish-setup.sh ${INSTALLER_ZIP_DIR} -${POSTGRES_DRIVERS}: pgdriver/postgresql-8.4-703.jdbc4.jar pgdriver/postgresql-9.0-802.jdbc4.jar pgdriver/postgresql-9.1-902.jdbc4.jar +${POSTGRES_DRIVERS}: pgdriver/postgresql-8.4-703.jdbc4.jar pgdriver/postgresql-9.0-802.jdbc4.jar pgdriver/postgresql-9.1-902.jdbc4.jar pgdriver/postgresql-9.2-1004.jdbc4.jar pgdriver/postgresql-9.3-1104.jdbc4.jar pgdriver/postgresql-9.4.1212.jar pgdriver/postgresql-42.1.4.jar @echo copying postgres drviers @mkdir -p ${POSTGRES_DRIVERS} - /bin/cp pgdriver/postgresql-8.4-703.jdbc4.jar pgdriver/postgresql-9.0-802.jdbc4.jar pgdriver/postgresql-9.1-902.jdbc4.jar ${INSTALLER_ZIP_DIR}/pgdriver + /bin/cp pgdriver/postgresql-8.4-703.jdbc4.jar pgdriver/postgresql-9.0-802.jdbc4.jar pgdriver/postgresql-9.1-902.jdbc4.jar pgdriver/postgresql-9.2-1004.jdbc4.jar pgdriver/postgresql-9.3-1104.jdbc4.jar pgdriver/postgresql-9.4.1212.jar pgdriver/postgresql-42.1.4.jar ${INSTALLER_ZIP_DIR}/pgdriver ${API_SCRIPTS}: ../api/setup-datasetfields.sh ../api/setup-users.sh ../api/setup-dvs.sh ../api/setup-identity-providers.sh ../api/setup-all.sh ../api/post-install-api-block.sh ../api/setup-builtin-roles.sh ../api/data @echo copying api scripts diff --git a/scripts/installer/install b/scripts/installer/install index a620cb00eaa..1644014a8d1 100755 --- a/scripts/installer/install +++ b/scripts/installer/install @@ -61,8 +61,6 @@ else 'SOLR_LOCATION', - 'TWORAVENS_LOCATION', - 'RSERVE_HOST', 'RSERVE_PORT', 'RSERVE_USER', @@ -88,8 +86,6 @@ my %CONFIG_DEFAULTS = ( 'SOLR_LOCATION', 'LOCAL', - 'TWORAVENS_LOCATION', 'NOT INSTALLED', - 'RSERVE_HOST', 'localhost', 'RSERVE_PORT', 6311, 'RSERVE_USER', 'rserve', @@ -112,8 +108,6 @@ my %CONFIG_PROMPTS = ( 'SOLR_LOCATION', 'Remote SOLR indexing service', - 'TWORAVENS_LOCATION', 'Will this Dataverse be using TwoRavens application', - 'RSERVE_HOST', 'Rserve Server', 'RSERVE_PORT', 'Rserve Server Port', 'RSERVE_USER', 'Rserve User Name', @@ -138,8 +132,6 @@ my %CONFIG_COMMENTS = ( 'SOLR_LOCATION', "? \n - Leave this set to \"LOCAL\" if the SOLR will be running on the same (this) server.\n Otherwise, please enter the host AND THE PORT NUMBER of the remote SOLR service, colon-separated\n (for example: foo.edu:8983)\n: ", - 'TWORAVENS_LOCATION', "? \n - If so, please provide the complete URL of the TwoRavens GUI under rApache,\n for example, \"https://foo.edu/dataexplore/gui.html\".\n (PLEASE NOTE, TwoRavens will need to be installed separately! 
- see the installation docs for more info)\n: ", - 'RSERVE_HOST', '', 'RSERVE_PORT', '', 'RSERVE_USER', '', @@ -155,15 +147,14 @@ my $API_URL = "http://localhost:8080/api"; # doesn't get paranoid) my %POSTGRES_DRIVERS = ( - # "8_4", "postgresql-8.3-603.jdbc4.jar", "8_4", "postgresql-8.4-703.jdbc4.jar", "9_0", "postgresql-9.0-802.jdbc4.jar", "9_1", "postgresql-9.1-902.jdbc4.jar", - "9_2", "postgresql-9.1-902.jdbc4.jar", - "9_3", "postgresql-9.1-902.jdbc4.jar", - "9_4", "postgresql-9.1-902.jdbc4.jar", - "9_5", "postgresql-9.1-902.jdbc4.jar", - "9_6", "postgresql-9.1-902.jdbc4.jar" + "9_2", "postgresql-9.2-1004.jdbc4.jar", + "9_3", "postgresql-9.3-1104.jdbc4.jar", + "9_4", "postgresql-9.4.1212.jar", + "9_5", "postgresql-42.1.4.jar", + "9_6", "postgresql-42.1.4.jar" ); # A few preliminary checks: @@ -967,6 +958,35 @@ if ( -e "/proc/meminfo" && open MEMINFO, "/proc/meminfo" ) { close MEMINFO; +# TODO: Figure out how to determine the amount of memory when running in Docker +# because we're wondering if Dataverse can run in the free OpenShift Online +# offering that only gives you 1 GB of memory. Obviously, if this is someone's +# first impression of Dataverse, we want to to run well! What if you try to +# ingest a large file or perform other memory-intensive operations? For more +# context, see https://github.com/IQSS/dataverse/issues/4040#issuecomment-331282286 + if ( -e "/sys/fs/cgroup/memory/memory.limit_in_bytes" && open CGROUPMEM, "/sys/fs/cgroup/memory/memory.limit_in_bytes" ) { + print "We must be running in Docker! Fancy!\n"; + while ( my $limitline = ) { + # The goal of this cgroup check is for + # "Setting the heap limit for Glassfish to 750MB" + # to change to some other value, based on memory available. + print "/sys/fs/cgroup/memory/memory.limit_in_bytes: $limitline\n"; + my $limit_in_kb = $limitline / 1024; + print "Docker limit_in_kb = $limit_in_kb but ignoring\n"; + # In openshift.json, notice how PostgreSQL and Solr have + # resources.limits.memory set to "256Mi". + # If you try to give the Dataverse/Glassfish container twice + # as much memory (512 MB) and allow $sys_mem_total to + # be set below, you should see the following: + # "Setting the heap limit for Glassfish to 192MB." + # FIXME: dataverse.war will not deploy with only 512 GB of memory. + # Again, the goal is 1 GB total (512MB + 256MB + 256MB) for + # Glassfish, PostgreSQL, and Solr to fit in the free OpenShift tier. + #print "setting sys_mem_total to: $limit_in_kb\n"; + #$sys_mem_total = $limit_in_kb; + } + close CGROUPMEM; + } } elsif ( -x "/usr/sbin/sysctl" ) { # MacOS X, probably... @@ -1287,40 +1307,7 @@ else print "OK.\n\n"; } -# b. If this installation is going to be using TwoRavens, configure its location in the Dataverse settings; -# Otherwise, set the "NO TwoRavens FOR YOU!" option in the settings: - - -if ($CONFIG_DEFAULTS{'TWORAVENS_LOCATION'} ne 'NOT INSTALLED') -{ - print "Executing " . "curl -X PUT -d " . $CONFIG_DEFAULTS{'TWORAVENS_LOCATION'} . " " . $API_URL . "/admin/settings/:TwoRavensUrl" . "\n"; - my $exit_code = system("curl -X PUT -d " . $CONFIG_DEFAULTS{'TWORAVENS_LOCATION'} . " " . $API_URL . "/admin/settings/:TwoRavensUrl"); - if ( $exit_code ) - { - print "WARNING: failed to configure the location of the TwoRavens app in the Dataverse settings!\n\n"; - } - else - { - print "OK.\n\n"; - } - - # (and, we also need to explicitly set the tworavens option to "true": - $exit_code = system("curl -X PUT -d true " . $API_URL . 
"/admin/settings/:TwoRavensTabularView"); - -} else { - print "Executing " . "curl -X PUT -d false " . $API_URL . "/admin/settings/:TwoRavensTabularView" . "\n"; - my $exit_code = system("curl -X PUT -d false " . $API_URL . "/admin/settings/:TwoRavensTabularView"); - if ( $exit_code ) - { - print "WARNING: failed to disable the TwoRavens app in the Dataverse settings!\n\n"; - } - else - { - print "OK.\n\n"; - } -} - -# c. If this installation is going to be using a remote SOLR search engine service, configure its location in the settings: +# b. If this installation is going to be using a remote SOLR search engine service, configure its location in the settings: if ($CONFIG_DEFAULTS{'SOLR_LOCATION'} ne 'LOCAL') { diff --git a/scripts/installer/pgdriver/postgresql-42.1.4.jar b/scripts/installer/pgdriver/postgresql-42.1.4.jar new file mode 100644 index 00000000000..08a54b105f8 Binary files /dev/null and b/scripts/installer/pgdriver/postgresql-42.1.4.jar differ diff --git a/scripts/installer/pgdriver/postgresql-9.2-1004.jdbc4.jar b/scripts/installer/pgdriver/postgresql-9.2-1004.jdbc4.jar new file mode 100644 index 00000000000..b9270d21b21 Binary files /dev/null and b/scripts/installer/pgdriver/postgresql-9.2-1004.jdbc4.jar differ diff --git a/scripts/installer/pgdriver/postgresql-9.3-1104.jdbc4.jar b/scripts/installer/pgdriver/postgresql-9.3-1104.jdbc4.jar new file mode 100644 index 00000000000..a79525d7a00 Binary files /dev/null and b/scripts/installer/pgdriver/postgresql-9.3-1104.jdbc4.jar differ diff --git a/scripts/installer/pgdriver/postgresql-9.4.1212.jar b/scripts/installer/pgdriver/postgresql-9.4.1212.jar new file mode 100644 index 00000000000..b0de752d880 Binary files /dev/null and b/scripts/installer/pgdriver/postgresql-9.4.1212.jar differ diff --git a/scripts/search/data/tabular/50by1000.dta.zip b/scripts/search/data/tabular/50by1000.dta.zip new file mode 100644 index 00000000000..4280a0608fa Binary files /dev/null and b/scripts/search/data/tabular/50by1000.dta.zip differ diff --git a/scripts/vagrant/install-tworavens.sh b/scripts/vagrant/install-tworavens.sh index 0d1c91130ac..f6957afc58a 100755 --- a/scripts/vagrant/install-tworavens.sh +++ b/scripts/vagrant/install-tworavens.sh @@ -2,7 +2,7 @@ echo "This script is highly experimental and makes many assumptions about how Dataverse is running in Vagrant. Please consult the TwoRavens section of the Dataverse Installation Guide instead." exit 1 cd /root -yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm +yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm yum install -y R R-devel # FIXME: /dataverse is mounted in Vagrant but not other places yum install -y /dataverse/doc/sphinx-guides/source/_static/installation/files/home/rpmbuild/rpmbuild/RPMS/x86_64/rapache-1.2.6-rpm0.x86_64.rpm @@ -16,9 +16,10 @@ if [ ! -f $COMMIT ]; then ./r-setup.sh # This is expected to take a while. Look for lines like "Package Zelig successfully installed" and "Successfully installed Dataverse R framework". fi # FIXME: copy preprocess.R into Glassfish while running and overwrite it -curl -X PUT -d true http://localhost:8080/api/admin/settings/:TwoRavensTabularView +# FIXME: enable TwoRavens by POSTing twoRavens.json. See note below about port 8888 vs 8080. 
+# TODO: programatically edit twoRavens.json to change "toolUrl" to "http://localhost:8888/dataexplore/gui.html" +curl -X POST -H 'Content-type: application/json' --upload-file /dataverse/doc/sphinx-guides/source/_static/installation/files/root/external-tools/twoRavens.json http://localhost:8080/api/admin/externalTools # Port 8888 because we're running in Vagrant. On the dev1 server we use https://dev1.dataverse.org/dataexplore/gui.html -curl -X PUT -d http://localhost:8888/dataexplore/gui.html http://localhost:8080/api/admin/settings/:TwoRavensUrl cd /root DIR=/var/www/html/dataexplore if [ ! -d $DIR ]; then diff --git a/scripts/vagrant/setup.sh b/scripts/vagrant/setup.sh index 5789450aa51..0ab2daf93a2 100644 --- a/scripts/vagrant/setup.sh +++ b/scripts/vagrant/setup.sh @@ -14,7 +14,7 @@ cp /dataverse/conf/vagrant/etc/yum.repos.d/epel-apache-maven.repo /etc/yum.repos # Uncomment this (and other shib stuff below) if you want # to use Vagrant (and maybe PageKite) to test Shibboleth. #yum install -y shibboleth shibboleth-embedded-ds -yum install -y java-1.8.0-openjdk-devel postgresql-server apache-maven httpd mod_ssl +yum install -y java-1.8.0-openjdk-devel postgresql-server apache-maven httpd mod_ssl unzip alternatives --set java /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java alternatives --set javac /usr/lib/jvm/java-1.8.0-openjdk.x86_64/bin/javac java -version @@ -64,3 +64,9 @@ service httpd start #service shibd restart #curl -k --sslv3 https://pdurbin.pagekite.me/Shibboleth.sso/Metadata > /downloads/pdurbin.pagekite.me #service httpd restart +echo "#########################################################################################" +echo "# This is a Vagrant test box, so we're disabling firewalld. # +echo "# Re-enable it with $ sudo systemctl enable firewalld && sudo systemctl start firewalld #" +echo "#########################################################################################" +systemctl disable firewalld +systemctl stop firewalld diff --git a/src/main/java/Bundle.properties b/src/main/java/Bundle.properties index dc0219f519c..53a9b236230 100755 --- a/src/main/java/Bundle.properties +++ b/src/main/java/Bundle.properties @@ -158,7 +158,7 @@ notification.createDataset={0} was created in {1}. To learn more about what you notification.dataset.management.title=Dataset Management - Dataset User Guide notification.wasSubmittedForReview={0} was submitted for review to be published in {1}. Don''t forget to publish it or send it back to the contributor\! notification.wasReturnedByReviewer={0} was returned by the curator of {1}. -notification.wasPublished={0}, was published in {1}. +notification.wasPublished={0} was published in {1}. notification.worldMap.added={0}, dataset had WorldMap layer data added to it. notification.maplayer.deletefailed=Failed to delete the map layer associated with the restricted file {0} from WorldMap. Please try again, or contact WorldMap and/or Dataverse support. (Dataset: {1}) notification.generic.objectDeleted=The dataverse, dataset, or file for this notification has been deleted. @@ -246,6 +246,7 @@ login.error=Error validating the username, email address, or password. Please tr user.error.cannotChangePassword=Sorry, your password cannot be changed. Please contact your system administrator. user.error.wrongPassword=Sorry, wrong password. 
login.button=Log In with {0} +login.button.orcid=Create or Connect your ORCID # authentication providers auth.providers.title=Other options auth.providers.tip=You can convert a Dataverse account to use one of the options above. Learn more. @@ -259,6 +260,7 @@ auth.providers.persistentUserIdName.orcid=ORCID iD auth.providers.persistentUserIdName.github=ID auth.providers.persistentUserIdTooltip.orcid=ORCID provides a persistent digital identifier that distinguishes you from other researchers. auth.providers.persistentUserIdTooltip.github=GitHub assigns a unique number to every user. +auth.providers.orcid.insufficientScope=Dataverse was not granted the permission to read user data from ORCID. # Friendly AuthenticationProvider names authenticationProvider.name.builtin=Dataverse authenticationProvider.name.null=(provider is unknown) @@ -331,7 +333,7 @@ oauth2.convertAccount.success=Your Dataverse account is now associated with your # oauth2/callback.xhtml oauth2.callback.page.title=OAuth Callback -oauth2.callback.message=OAuth2 Error - Sorry, the identification process did not succeed. +oauth2.callback.message=Authentication Error - Dataverse could not authenticate your ORCID login. Please make sure you authorize your ORCID account to connect with Dataverse. For more details about the information being requested, see the User Guide. # tab on dataverseuser.xhtml apitoken.title=API Token @@ -747,9 +749,9 @@ dataverse.results.cards.foundInMetadata=Found in Metadata Fields: dataverse.results.cards.files.tabularData=Tabular Data dataverse.results.solrIsDown=Please note: Due to an internal error, browsing and searching is not available. dataverse.theme.title=Theme -dataverse.theme.inheritCustomization.title=Check this to use the existing theme. -dataverse.theme.inheritCustomization.label=Inherit Customization -dataverse.theme.inheritCustomization.checkbox=Inherit customization from {0} +dataverse.theme.inheritCustomization.title=For this dataverse, use the same theme as the parent dataverse. +dataverse.theme.inheritCustomization.label=Inherit Theme +dataverse.theme.inheritCustomization.checkbox=Inherit theme from {0} dataverse.theme.logo=Logo dataverse.theme.logo.tip=Supported image types are JPG, TIF, or PNG and should be no larger than 500 KB. The maximum display size for an image file in a dataverse's theme is 940 pixels wide by 120 pixels high. dataverse.theme.logo.format=Logo Format @@ -792,6 +794,7 @@ dataverse.theme.website.title=URL for your personal website, institution, or any dataverse.theme.website.tip=The website will be linked behind the tagline. To have a website listed, you must also provide a tagline. dataverse.theme.website.watermark=Your personal site, http://... dataverse.theme.website.invalidMsg=Invalid URL. +dataverse.theme.disabled=The theme for the root dataverse has been administratively disabled with the :DisableRootDataverseTheme database setting. dataverse.widgets.title=Widgets dataverse.widgets.notPublished.why.header=Why Use Widgets? dataverse.widgets.notPublished.why.reason1=Increases the web visibility of your data by allowing you to embed your dataverse and datasets into your personal or project website. 
@@ -1088,6 +1091,7 @@ dataset.guestbookResponse.guestbook.additionalQuestions=Additional Questions dataset.guestbookResponse.guestbook.responseTooLong=Please limit response to 255 characters # dataset.xhtml +dataset.configureBtn=Configure dataset.pageTitle=Add New Dataset dataset.editBtn=Edit dataset.editBtn.itemLabel.upload=Files (Upload) @@ -1104,6 +1108,7 @@ dataset.editBtn.itemLabel.deaccession=Deaccession Dataset dataset.exportBtn=Export Metadata dataset.exportBtn.itemLabel.ddi=DDI dataset.exportBtn.itemLabel.dublinCore=Dublin Core +dataset.exportBtn.itemLabel.schemaDotOrg=Schema.org JSON-LD dataset.exportBtn.itemLabel.json=JSON metrics.title=Metrics metrics.title.tip=View more metrics information @@ -1117,9 +1122,9 @@ dataset.publish.header=Publish Dataset dataset.rejectBtn=Return to Author dataset.submitBtn=Submit for Review dataset.disabledSubmittedBtn=Submitted for Review -dataset.submitMessage=Submit this dataset for review by the curator of this dataverse for possible publishing. +dataset.submitMessage=You will not be able to make changes to this dataset while it is in review. dataset.submit.success=Your dataset has been submitted for review. -dataset.inreview.infoMessage=This dataset has been submitted for review. +dataset.inreview.infoMessage=\u2013 This dataset is currently under review prior to publication. dataset.submit.failure=Dataset Submission Failed - {0} dataset.submit.failure.null=Can't submit for review. Dataset is null. dataset.submit.failure.isReleased=Latest version of dataset is already released. Only draft versions can be submitted for review. @@ -1155,6 +1160,7 @@ dataset.share.datasetShare=Share Dataset dataset.share.datasetShare.tip=Share this dataset on your favorite social media networks. dataset.share.datasetShare.shareText=View this dataset. dataset.locked.message=Dataset Locked +dataset.locked.inReview.message=Submitted for Review dataset.publish.error=This dataset may not be published because the {0} Service is currently inaccessible. Please try again. Does the issue continue to persist? dataset.publish.error.doi=This dataset may not be published because the DOI update failed. dataset.delete.error=Could not deaccession the dataset because the {0} update failed. @@ -1183,7 +1189,10 @@ dataset.asterisk.tip=Asterisks indicate required fields dataset.message.uploadFiles=Upload Dataset Files - You can drag and drop files from your desktop, directly into the upload widget. dataset.message.editMetadata=Edit Dataset Metadata - Add more metadata about this dataset to help others easily find it. dataset.message.editTerms=Edit Dataset Terms - Update this dataset's terms of use. -dataset.message.locked=Dataset Locked +dataset.message.locked.editNotAllowedInReview=Dataset cannot be edited due to In Review dataset lock. +dataset.message.locked.downloadNotAllowedInReview=Dataset file(s) may not be downloaded due to In Review dataset lock. +dataset.message.locked.downloadNotAllowed=Dataset file(s) may not be downloaded due to dataset lock. +dataset.message.locked.editNotAllowed=Dataset cannot be edited due to dataset lock. dataset.message.createSuccess=This dataset has been created. dataset.message.linkSuccess= {0} has been successfully linked to {1}. dataset.message.metadataSuccess=The metadata for this dataset has been updated. @@ -1197,7 +1206,6 @@ dataset.message.bulkFileDeleteSuccess=The selected files have been deleted. datasetVersion.message.deleteSuccess=This dataset draft has been deleted. 
datasetVersion.message.deaccessionSuccess=The selected version(s) have been deaccessioned. dataset.message.deaccessionSuccess=This dataset has been deaccessioned. -dataset.message.files.ingestSuccess=The file(s) have been successfully ingested. You can now explore them with TwoRavens or download them in alternative formats. dataset.message.validationError=Validation Error - Required fields were missed or there was a validation error. Please scroll down to see details. dataset.message.publishFailure=The dataset could not be published. dataset.message.metadataFailure=The metadata could not be updated. @@ -1249,10 +1257,10 @@ file.count.selected={0} {0, choice, 0#Files Selected|1#File Selected|2#Files Sel file.selectToAddBtn=Select Files to Add file.selectToAdd.tipLimit=File upload limit is {0} bytes per file. file.selectToAdd.tipMoreInformation=For more information about supported file formats, please refer to the User Guide. +file.selectToAdd.dragdropMsg=Drag and drop files here. file.createUploadDisabled=Once you have saved your dataset, you can upload your data using the "Upload Files" button on the dataset page. For more information about supported file formats, please refer to the User Guide. file.fromDropbox=Upload from Dropbox file.fromDropbox.tip=Files can also be uploaded directly from Dropbox. -file.fromDropbox.description=Drag and drop files here. file.replace.original=Original File file.editFiles=Edit Files file.bulkUpdate=Bulk Update @@ -1275,6 +1283,7 @@ file.restrict=Restrict file.unrestrict=Unrestrict file.restricted.success=Files "{0}" will be restricted once you click on the Save Changes button. file.download.header=Download +file.download.subset.header=Download Data Subset file.preview=Preview: file.previewMap=Preview Map:o file.fileName=File Name @@ -1297,7 +1306,7 @@ file.rsyncUpload.step2.downloadScriptButton=Download Script file.rsyncUpload.step3=Open a terminal window in the same directory you saved the script and run this command: bash ./{0} file.rsyncUpload.step4=Follow the instructions in the script. It will ask for a full path (beginning with "/") to the directory containing your data. Note: this script will expire after 7 days. file.rsyncUpload.inProgressMessage.summary=DCM File Upload -file.rsyncUpload.inProgressMessage.details=- this dataset is locked until the data files have been transferred and verified. +file.rsyncUpload.inProgressMessage.details=This dataset is locked until the data files have been transferred and verified. file.metaData.dataFile.dataTab.variables=Variables file.metaData.dataFile.dataTab.observations=Observations @@ -1333,8 +1342,8 @@ file.spss-savEncoding.current=Current Selection: file.spss-porExtraLabels=Variable Labels file.spss-porExtraLabels.title=Upload an additional text file with extra variable labels. file.spss-porExtraLabels.selectToAddBtn=Select File to Add -file.ingestFailed=Tabular Data Ingest Failed -file.explore.twoRavens=TwoRavens +file.ingestFailed.header=Upload Completed with Errors +file.ingestFailed.message=Tabular data ingest failed. file.map=Map file.mapData=Map Data file.mapData.worldMap=WorldMap @@ -1506,9 +1515,8 @@ file.viewDiffDialog.msg.versionNotFound=Version "{0}" was not found. file.metadataTip=Metadata Tip: After adding the dataset, click the Edit Dataset button to add more metadata. 
file.addBtn=Save Dataset file.dataset.allFiles=All Files from this Dataset -file.downloadDialog.header=Download File -file.downloadDialog.tip=Please confirm and/or complete the information needed below in order to download files in this dataset. -file.downloadDialog.termsTip=I accept these Terms of Use. +file.downloadDialog.header=Dataset Terms +file.downloadDialog.tip=Please confirm and/or complete the information needed below in order to continue. file.requestAccessTermsDialog.tip=Please confirm and/or complete the information needed below in order to request access to files in this dataset. file.search.placeholder=Search this dataset... file.results.btn.sort=Sort @@ -1519,10 +1527,12 @@ file.results.btn.sort.option.oldest=Oldest file.results.btn.sort.option.size=Size file.results.btn.sort.option.type=Type file.compute.fileRestricted=File Restricted -file.compute.fileAccessDenied=You cannot compute on this restricted file because you don't have permission to access it. +file.compute.fileAccessDenied=You cannot compute on this restricted file because you do not have permission to access it. +file.configure.Button=Configure +file.configure.launchMessage.details=Please refresh this page once you have finished configuring your dataset.compute.datasetCompute=Dataset Compute Not Supported -dataset.compute.datasetAccessDenied=You cannot compute on this dataset because you don't have permission to access all of the restricted files. -dataset.compute.datasetComputeDisabled=You cannot compute on this dataset because this functionality is not enabled yet. Please click on a file to access computing capalibities. +dataset.compute.datasetAccessDenied=You cannot compute on this dataset because you do not have permission to access all of the restricted files. +dataset.compute.datasetComputeDisabled=You cannot compute on this dataset because this functionality is not enabled yet. Please click on a file to access computing features. # dataset-widgets.xhtml dataset.widgets.title=Dataset Thumbnail + Widgets diff --git a/src/main/java/edu/harvard/iq/dataverse/ConfigureFragmentBean.java b/src/main/java/edu/harvard/iq/dataverse/ConfigureFragmentBean.java new file mode 100644 index 00000000000..0899b180dcc --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/ConfigureFragmentBean.java @@ -0,0 +1,90 @@ +/* + * To change this license header, choose License Headers in Project Properties. + * To change this template file, choose Tools | Templates + * and open the template in the editor. + */ +package edu.harvard.iq.dataverse; + +import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; +import edu.harvard.iq.dataverse.authorization.users.ApiToken; +import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.authorization.users.User; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.externaltools.ExternalToolHandler; +import edu.harvard.iq.dataverse.util.BundleUtil; +import static edu.harvard.iq.dataverse.util.JsfHelper.JH; +import java.util.logging.Logger; +import javax.ejb.EJB; +import javax.faces.application.FacesMessage; +import javax.faces.view.ViewScoped; +import javax.inject.Inject; +import javax.inject.Named; + +/** + * This bean is mainly for keeping track of which file the user selected to run external tools on. + * Also for creating an alert across Dataset and DataFile page, and making it easy to get the file-specific handler for a tool. 
+ * @author madunlap + */ + +@ViewScoped +@Named +public class ConfigureFragmentBean implements java.io.Serializable{ + + private static final Logger logger = Logger.getLogger(ConfigureFragmentBean.class.getName()); + + private ExternalTool tool = null; + private Long fileId = null; + private ExternalToolHandler toolHandler = null; + + @EJB + DataFileServiceBean datafileService; + @Inject + DataverseSession session; + @EJB + AuthenticationServiceBean authService; + + public String configureExternalAlert() { + JH.addMessage(FacesMessage.SEVERITY_WARN, tool.getDisplayName(), BundleUtil.getStringFromBundle("file.configure.launchMessage.details") + " " + tool.getDisplayName() + "."); + return ""; + } + + /** + * @param setTool the tool to set + */ + public void setConfigurePopupTool(ExternalTool setTool) { + tool = setTool; + } + + /** + * @return the Tool + */ + public ExternalTool getConfigurePopupTool() { + return tool; + } + + public ExternalToolHandler getConfigurePopupToolHandler() { + if(fileId == null) { + //on first UI load, method is called before fileId is set. There may be a better way to handle this + return null; + } + if(toolHandler != null) { + return toolHandler; + } + + datafileService.find(fileId); + + ApiToken apiToken = new ApiToken(); + User user = session.getUser(); + if (user instanceof AuthenticatedUser) { + apiToken = authService.findApiTokenByUser((AuthenticatedUser) user); + } + + toolHandler = new ExternalToolHandler(tool, datafileService.find(fileId), apiToken); + return toolHandler; + } + + public void setConfigureFileId(Long setFileId) + { + fileId = setFileId; + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/CustomQuestionResponse.java b/src/main/java/edu/harvard/iq/dataverse/CustomQuestionResponse.java index a9e03a300c8..68129e37502 100644 --- a/src/main/java/edu/harvard/iq/dataverse/CustomQuestionResponse.java +++ b/src/main/java/edu/harvard/iq/dataverse/CustomQuestionResponse.java @@ -103,5 +103,17 @@ public boolean equals(Object object) { public String toString() { return "edu.harvard.iq.dvn.core.vdc.CustomQuestionResponse[ id=" + id + " ]"; } + + @Transient private String validationMessage; + + public String getValidationMessage() { + return validationMessage; + } + + public void setValidationMessage(String validationMessage) { + this.validationMessage = validationMessage; + } + + } diff --git a/src/main/java/edu/harvard/iq/dataverse/DOIDataCiteRegisterService.java b/src/main/java/edu/harvard/iq/dataverse/DOIDataCiteRegisterService.java index 4224b565159..e058ecd4b83 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DOIDataCiteRegisterService.java +++ b/src/main/java/edu/harvard/iq/dataverse/DOIDataCiteRegisterService.java @@ -15,11 +15,9 @@ import java.util.List; import java.util.logging.Level; import java.util.logging.Logger; -import javax.annotation.PreDestroy; import javax.ejb.Stateless; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; -import javax.persistence.Query; import javax.persistence.TypedQuery; import org.jsoup.Jsoup; import org.jsoup.nodes.Document; @@ -57,7 +55,7 @@ public String createIdentifier(String identifier, HashMap metada metadataTemplate.setPublisherYear(metadata.get("datacite.publicationyear")); String xmlMetadata = metadataTemplate.generateXML(); - logger.fine("XML to send to DataCite: " + xmlMetadata); + logger.log(Level.FINE, "XML to send to DataCite: {0}", xmlMetadata); String status = metadata.get("_status").trim(); String target = metadata.get("_target"); @@ -92,8 +90,14 @@ 
public String createIdentifier(String identifier, HashMap metada try (DataCiteRESTfullClient client = openClient()) { retString = client.postMetadata(xmlMetadata); client.postUrl(identifier.substring(identifier.indexOf(":") + 1), target); + } catch (UnsupportedEncodingException ex) { - Logger.getLogger(DOIDataCiteRegisterService.class.getName()).log(Level.SEVERE, null, ex); + logger.log(Level.SEVERE, null, ex); + + } catch ( RuntimeException rte ) { + logger.log(Level.SEVERE, "Error creating DOI at DataCite: {0}", rte.getMessage()); + logger.log(Level.SEVERE, "Exception", rte); + } } } else if (status.equals("unavailable")) { @@ -264,7 +268,7 @@ public String generateXML() { if (author.getIdType() != null && author.getIdValue() != null && !author.getIdType().isEmpty() && !author.getIdValue().isEmpty() && author.getAffiliation() != null && !author.getAffiliation().getDisplayValue().isEmpty()) { if (author.getIdType().equals("ORCID")) { - creatorsElement.append("" + author.getIdValue() + ""); + creatorsElement.append("" + author.getIdValue() + ""); } if (author.getIdType().equals("ISNI")) { creatorsElement.append("" + author.getIdValue() + ""); diff --git a/src/main/java/edu/harvard/iq/dataverse/DataCitation.java b/src/main/java/edu/harvard/iq/dataverse/DataCitation.java index c969d4c29a1..a30912ba8f5 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataCitation.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataCitation.java @@ -160,7 +160,7 @@ public String toString(boolean html) { citationList.add(year); citationList.add(formatString(title, html, "\"")); if (persistentId != null) { - citationList.add(formatURL(persistentId.toString(), persistentId.toURL().toString(), html)); + citationList.add(formatURL(persistentId.toURL().toString(), persistentId.toURL().toString(), html)); //always show url format } citationList.add(formatString(distributors, html)); citationList.add(version); diff --git a/src/main/java/edu/harvard/iq/dataverse/DataCiteRESTfullClient.java b/src/main/java/edu/harvard/iq/dataverse/DataCiteRESTfullClient.java index 93607a56541..a329f663fb5 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataCiteRESTfullClient.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataCiteRESTfullClient.java @@ -169,11 +169,11 @@ public boolean testDOIExists(String doi) { * @param metadata * @return */ - public String postMetadata(String metadata) throws UnsupportedEncodingException { + public String postMetadata(String metadata) { HttpPost httpPost = new HttpPost(this.url + "/metadata"); httpPost.setHeader("Content-Type", "application/xml;charset=UTF-8"); - httpPost.setEntity(new StringEntity(metadata, "utf-8")); try { + httpPost.setEntity(new StringEntity(metadata, "utf-8")); HttpResponse response = httpClient.execute(httpPost,context); String data = EntityUtils.toString(response.getEntity(), encoding); @@ -183,6 +183,7 @@ public String postMetadata(String metadata) throws UnsupportedEncodingException throw new RuntimeException(errMsg); } return data; + } catch (IOException ioe) { logger.log(Level.SEVERE, "IOException when post metadata"); throw new RuntimeException("IOException when post metadata", ioe); diff --git a/src/main/java/edu/harvard/iq/dataverse/DataFile.java b/src/main/java/edu/harvard/iq/dataverse/DataFile.java index fccdec8d68f..e7532c36cc2 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataFile.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataFile.java @@ -217,7 +217,7 @@ public DataFile(String contentType) { /** * All constructors should use this method - * 
to intitialize this file replace attributes + * to initialize this file replace attributes */ private void initFileReplaceAttributes(){ this.rootDataFileId = ROOT_DATAFILE_ID_DEFAULT; @@ -377,6 +377,7 @@ public String getIngestReportMessage() { } return "Ingest failed. No further information is available."; } + public boolean isTabularData() { return getDataTables() != null && getDataTables().size() > 0; } diff --git a/src/main/java/edu/harvard/iq/dataverse/Dataset.java b/src/main/java/edu/harvard/iq/dataverse/Dataset.java index 144285299ad..84b4a4934bc 100644 --- a/src/main/java/edu/harvard/iq/dataverse/Dataset.java +++ b/src/main/java/edu/harvard/iq/dataverse/Dataset.java @@ -9,9 +9,10 @@ import java.util.ArrayList; import java.util.Collection; import java.util.Date; +import java.util.HashSet; import java.util.List; import java.util.Objects; -import java.util.logging.Logger; +import java.util.Set; import javax.persistence.CascadeType; import javax.persistence.Column; import javax.persistence.Entity; @@ -73,7 +74,6 @@ sequence. Used when the Dataverse is (optionally) configured to use @Index(columnList = "thumbnailfile_id")}, uniqueConstraints = @UniqueConstraint(columnNames = {"authority,protocol,identifier,doiseparator"})) public class Dataset extends DvObjectContainer { - private static final Logger logger = Logger.getLogger(Dataset.class.getCanonicalName()); public static final String TARGET_URL = "/citation?persistentId="; private static final long serialVersionUID = 1L; @@ -100,8 +100,8 @@ public class Dataset extends DvObjectContainer { @OrderBy("versionNumber DESC, minorVersionNumber DESC") private List versions = new ArrayList<>(); - @OneToOne(mappedBy = "dataset", cascade = {CascadeType.REMOVE, CascadeType.MERGE, CascadeType.PERSIST}, orphanRemoval = true) - private DatasetLock datasetLock; + @OneToMany(mappedBy = "dataset", cascade = CascadeType.ALL, orphanRemoval = true) + private Set datasetLocks; @OneToOne(cascade = {CascadeType.MERGE, CascadeType.PERSIST}) @JoinColumn(name = "thumbnailfile_id") @@ -154,7 +154,63 @@ public Dataset() { datasetVersion.setMinorVersionNumber((long) 0); versions.add(datasetVersion); } + + /** + * Checks whether {@code this} dataset is locked for a given reason. + * @param reason the reason we test for. + * @return {@code true} iff the data set is locked for {@code reason}. + */ + public boolean isLockedFor( DatasetLock.Reason reason ) { + for ( DatasetLock l : getLocks() ) { + if ( l.getReason() == reason ) { + return true; + } + } + return false; + } + + /** + * Retrieves the dataset lock for the passed reason. + * @param reason + * @return the dataset lock, or {@code null}. + */ + public DatasetLock getLockFor( DatasetLock.Reason reason ) { + for ( DatasetLock l : getLocks() ) { + if ( l.getReason() == reason ) { + return l; + } + } + return null; + } + + public Set getLocks() { + // lazy set creation + if ( datasetLocks == null ) { + setLocks( new HashSet<>() ); + } + return datasetLocks; + } + /** + * JPA use only! 
+ * @param datasetLocks + */ + void setLocks(Set datasetLocks) { + this.datasetLocks = datasetLocks; + } + + public void addLock(DatasetLock datasetLock) { + getLocks().add(datasetLock); + } + + public void removeLock( DatasetLock aDatasetLock ) { + getLocks().remove( aDatasetLock ); + } + + public boolean isLocked() { + return !getLocks().isEmpty(); + } + public String getProtocol() { return protocol; } @@ -240,18 +296,6 @@ public void setFiles(List files) { this.files = files; } - public DatasetLock getDatasetLock() { - return datasetLock; - } - - public void setDatasetLock(DatasetLock datasetLock) { - this.datasetLock = datasetLock; - } - - public boolean isLocked() { - return (getDatasetLock()!=null); - } - public boolean isDeaccessioned() { // return true, if all published versions were deaccessioned boolean hasDeaccessionedVersions = false; diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetField.java b/src/main/java/edu/harvard/iq/dataverse/DatasetField.java index 68b086e75e3..7bea9250279 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetField.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetField.java @@ -299,7 +299,10 @@ public List getValues_nondisplay() List returnList = new ArrayList(); if (!datasetFieldValues.isEmpty()) { for (DatasetFieldValue dsfv : datasetFieldValues) { - returnList.add(dsfv.getValue()); + String value = dsfv.getValue(); + if (value != null) { + returnList.add(value); + } } } else { for (ControlledVocabularyValue cvv : controlledVocabularyValues) { diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetLock.java b/src/main/java/edu/harvard/iq/dataverse/DatasetLock.java index 8e572cd3c39..3114ab6dc45 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetLock.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetLock.java @@ -20,6 +20,7 @@ package edu.harvard.iq.dataverse; +import static edu.harvard.iq.dataverse.DatasetLock.Reason.Workflow; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import java.util.Date; import java.io.Serializable; @@ -33,7 +34,6 @@ import javax.persistence.Index; import javax.persistence.JoinColumn; import javax.persistence.ManyToOne; -import javax.persistence.OneToOne; import javax.persistence.Table; import javax.persistence.Temporal; import javax.persistence.TemporalType; @@ -52,7 +52,7 @@ @Table(indexes = {@Index(columnList="user_id"), @Index(columnList="dataset_id")}) @NamedQueries( @NamedQuery(name="DatasetLock.getLocksByDatasetId", - query="SELECT l FROM DatasetLock l WHERE l.dataset.id=:datasetId") + query="SELECT lock FROM DatasetLock lock WHERE lock.dataset.id=:datasetId") ) public class DatasetLock implements Serializable { @@ -79,13 +79,13 @@ public enum Reason { @Temporal(value = TemporalType.TIMESTAMP) private Date startTime; - @OneToOne + @ManyToOne @JoinColumn(nullable=false) private Dataset dataset; @ManyToOne @JoinColumn(nullable=false) - private AuthenticatedUser user; + private AuthenticatedUser user; @Enumerated(EnumType.STRING) @Column(nullable=false) @@ -119,7 +119,7 @@ public DatasetLock(Reason aReason, AuthenticatedUser aUser, String infoMessage) startTime = new Date(); user = aUser; info = infoMessage; - + } /** diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java index 030553916d3..3fecb0e0ba5 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java @@ -77,13 +77,17 @@ import java.util.HashSet; import 
javax.faces.model.SelectItem; import java.util.logging.Level; -import edu.harvard.iq.dataverse.datasetutility.TwoRavensHelper; import edu.harvard.iq.dataverse.datasetutility.WorldMapPermissionHelper; +import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; +import edu.harvard.iq.dataverse.engine.command.impl.GetLatestPublishedDatasetVersionCommand; import edu.harvard.iq.dataverse.engine.command.impl.RequestRsyncScriptCommand; import edu.harvard.iq.dataverse.engine.command.impl.PublishDatasetResult; import edu.harvard.iq.dataverse.engine.command.impl.RestrictFileCommand; import edu.harvard.iq.dataverse.engine.command.impl.ReturnDatasetToAuthorCommand; import edu.harvard.iq.dataverse.engine.command.impl.SubmitDatasetForReviewCommand; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.externaltools.ExternalToolServiceBean; +import edu.harvard.iq.dataverse.export.SchemaDotOrgExporter; import java.util.Collections; import javax.faces.event.AjaxBehaviorEvent; @@ -167,6 +171,9 @@ public enum DisplayMode { DataverseRoleServiceBean dataverseRoleService; @EJB PrivateUrlServiceBean privateUrlService; + @EJB + ExternalToolServiceBean externalToolService; + @Inject DataverseRequestServiceBean dvRequestService; @Inject @@ -176,13 +183,12 @@ public enum DisplayMode { @Inject FileDownloadHelper fileDownloadHelper; @Inject - TwoRavensHelper twoRavensHelper; - @Inject WorldMapPermissionHelper worldMapPermissionHelper; @Inject ThumbnailServiceWrapper thumbnailServiceWrapper; @Inject SettingsWrapper settingsWrapper; + private Dataset dataset = new Dataset(); @@ -245,6 +251,11 @@ public enum DisplayMode { private Boolean hasRsyncScript = false; + List configureTools = new ArrayList<>(); + List exploreTools = new ArrayList<>(); + Map> configureToolsByFileId = new HashMap<>(); + Map> exploreToolsByFileId = new HashMap<>(); + public Boolean isHasRsyncScript() { return hasRsyncScript; } @@ -1419,6 +1430,7 @@ private String init(boolean initFull) { // populate MapLayerMetadata this.loadMapLayerMetadataLookup(); // A DataFile may have a related MapLayerMetadata object this.guestbookResponse = guestbookResponseService.initGuestbookResponseForFragment(dataset, null, session); + this.getFileDownloadHelper().setGuestbookResponse(guestbookResponse); logger.fine("Checking if rsync support is enabled."); if (DataCaptureModuleUtil.rsyncSupportEnabled(settingsWrapper.getValueForKey(SettingsServiceBean.Key.UploadMethods))) { try { @@ -1503,15 +1515,23 @@ private String init(boolean initFull) { // Various info messages, when the dataset is locked (for various reasons): if (dataset.isLocked()) { - if (dataset.getDatasetLock().getReason().equals(DatasetLock.Reason.DcmUpload)) { - JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.summary"), BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.details")); - } else if (dataset.getDatasetLock().getReason().equals(DatasetLock.Reason.Workflow)) { - JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("dataset.locked.message"), BundleUtil.getStringFromBundle("dataset.publish.workflow.inprogress")); - } else if (dataset.getDatasetLock().getReason().equals(DatasetLock.Reason.InReview)) { - JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("dataset.locked.message"), BundleUtil.getStringFromBundle("dataset.inreview.infoMessage")); + if (dataset.isLockedFor(DatasetLock.Reason.Workflow)) { + 
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("dataset.locked.message"), + BundleUtil.getStringFromBundle("dataset.publish.workflow.inprogress")); + } + if (dataset.isLockedFor(DatasetLock.Reason.InReview)) { + JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("dataset.locked.inReview.message"), + BundleUtil.getStringFromBundle("dataset.inreview.infoMessage")); + } + if (dataset.isLockedFor(DatasetLock.Reason.DcmUpload)) { + JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.summary"), + BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.details")); } } - + + configureTools = externalToolService.findByType(ExternalTool.Type.CONFIGURE); + exploreTools = externalToolService.findByType(ExternalTool.Type.EXPLORE); + return null; } @@ -1862,7 +1882,7 @@ private String releaseDataset(boolean minor) { // has been published. If a publishing workflow is configured, this may have sent the // dataset into a workflow limbo, potentially waiting for a third party system to complete // the process. So it may be premature to show the "success" message at this point. - if (dataset.isLocked() && dataset.getDatasetLock().getReason().equals(DatasetLock.Reason.Workflow)) { + if (dataset.isLockedFor(DatasetLock.Reason.Workflow)) { JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("dataset.locked.message"), BundleUtil.getStringFromBundle("dataset.publish.workflow.inprogress")); } else { JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("dataset.message.publishSuccess")); @@ -2060,8 +2080,16 @@ public void validateFilesForDownload(boolean guestbookRequired){ requestContext.execute("PF('selectFilesForDownload').show()"); return; } + + List allFiles = new ArrayList<>(); - + if (isSelectAllFiles()){ + for (FileMetadata fm: workingVersion.getFileMetadatas()){ + allFiles.add(fm); + } + this.selectedFiles = allFiles; + } + for (FileMetadata fmd : this.selectedFiles){ if(this.fileDownloadHelper.canDownloadFile(fmd)){ getSelectedDownloadableFiles().add(fmd); @@ -2090,7 +2118,7 @@ public void validateFilesForDownload(boolean guestbookRequired){ } } - + private boolean selectAllFiles; public boolean isSelectAllFiles() { @@ -2100,32 +2128,13 @@ public boolean isSelectAllFiles() { public void setSelectAllFiles(boolean selectAllFiles) { this.selectAllFiles = selectAllFiles; } - - public void toggleSelectedFiles(){ - //method for when user clicks (de-)select all files - this.selectedFiles = new ArrayList<>(); - if(this.selectAllFiles){ - for (FileMetadata fmd : workingVersion.getFileMetadatas()) { - this.selectedFiles.add(fmd); - fmd.setSelected(true); - } - } else { - for (FileMetadata fmd : workingVersion.getFileMetadatas()) { - fmd.setSelected(false); - } - } - updateFileCounts(); + + public void toggleAllSelected(){ + //This is here so that if the user selects all on the dataset page + // s/he will get all files on download + this.selectAllFiles = !this.selectAllFiles; } - - public void updateSelectedFiles(FileMetadata fmd){ - if(fmd.isSelected()){ - this.selectedFiles.add(fmd); - } else{ - this.selectedFiles.remove(fmd); - } - updateFileCounts(); - } // helper Method public String getSelectedFilesIdsString() { @@ -2615,12 +2624,26 @@ public void refreshLock() { //requestContext.execute("refreshPage();"); } } + + public void refreshIngestLock() { + //RequestContext requestContext = RequestContext.getCurrentInstance(); + logger.fine("checking ingest lock"); + if 
(isStillLockedForIngest()) { + logger.fine("(still locked)"); + } else { + // OK, the dataset is no longer locked. + // let's tell the page to refresh: + logger.fine("no longer locked!"); + stateChanged = true; + //requestContext.execute("refreshPage();"); + } + } /* public boolean isLockedInProgress() { if (dataset != null) { - logger.fine("checking lock status of dataset " + dataset.getId()); + logger.log(Level.FINE, "checking lock status of dataset {0}", dataset.getId()); if (dataset.isLocked()) { return true; } @@ -2629,19 +2652,18 @@ public boolean isLockedInProgress() { }*/ public boolean isDatasetLockedInWorkflow() { - if (dataset != null) { - if (dataset.isLocked()) { - if (dataset.getDatasetLock().getReason().equals(DatasetLock.Reason.Workflow)) { - return true; - } - } - } - return false; + return (dataset != null) + ? dataset.isLockedFor(DatasetLock.Reason.Workflow) + : false; } public boolean isStillLocked() { + if (dataset != null && dataset.getId() != null) { - logger.fine("checking lock status of dataset " + dataset.getId()); + logger.log(Level.FINE, "checking lock status of dataset {0}", dataset.getId()); + if(dataset.getLocks().size() == 1 && dataset.getLockFor(DatasetLock.Reason.InReview) != null){ + return false; + } if (datasetService.checkDatasetLock(dataset.getId())) { return true; } @@ -2649,6 +2671,21 @@ public boolean isStillLocked() { return false; } + + public boolean isStillLockedForIngest() { + if (dataset.getId() != null) { + Dataset testDataset = datasetService.find(dataset.getId()); + if (testDataset != null && testDataset.getId() != null) { + logger.log(Level.FINE, "checking lock status of dataset {0}", dataset.getId()); + + if (testDataset.getLockFor(DatasetLock.Reason.Ingest) != null) { + return true; + } + } + } + return false; + } + public boolean isLocked() { if (stateChanged) { return false; @@ -2662,11 +2699,63 @@ public boolean isLocked() { return false; } + public boolean isLockedForIngest() { + if (dataset.getId() != null) { + Dataset testDataset = datasetService.find(dataset.getId()); + if (stateChanged) { + return false; + } + + if (testDataset != null) { + if (testDataset.getLockFor(DatasetLock.Reason.Ingest) != null) { + return true; + } + } + } + return false; + } + + private Boolean lockedFromEditsVar; + private Boolean lockedFromDownloadVar; + /** + * Authors are not allowed to edit but curators are allowed - when Dataset is inReview + * For all other locks edit should be locked for all editors. + */ + public boolean isLockedFromEdits() { + if(null == lockedFromEditsVar) { + try { + permissionService.checkEditDatasetLock(dataset, dvRequestService.getDataverseRequest(), new UpdateDatasetCommand(dataset, dvRequestService.getDataverseRequest())); + lockedFromEditsVar = false; + } catch (IllegalCommandException ex) { + lockedFromEditsVar = true; + } + } + return lockedFromEditsVar; + } + + public boolean isLockedFromDownload(){ + if(null == lockedFromDownloadVar) { + try { + permissionService.checkDownloadFileLock(dataset, dvRequestService.getDataverseRequest(), new CreateDatasetCommand(dataset, dvRequestService.getDataverseRequest())); + lockedFromDownloadVar = false; + } catch (IllegalCommandException ex) { + lockedFromDownloadVar = true; + return true; + } + } + return lockedFromDownloadVar; + } + public void setLocked(boolean locked) { // empty method, so that we can use DatasetPage.locked in a hidden // input on the page. 
} + public void setLockedForIngest(boolean locked) { + // empty method, so that we can use DatasetPage.locked in a hidden + // input on the page. + } + public boolean isStateChanged() { return stateChanged; } @@ -3838,14 +3927,6 @@ public void setWorldMapPermissionHelper(WorldMapPermissionHelper worldMapPermiss this.worldMapPermissionHelper = worldMapPermissionHelper; } - public TwoRavensHelper getTwoRavensHelper() { - return twoRavensHelper; - } - - public void setTwoRavensHelper(TwoRavensHelper twoRavensHelper) { - this.twoRavensHelper = twoRavensHelper; - } - /** * dataset title * @return title of workingVersion @@ -3864,23 +3945,6 @@ public String getDescription() { return workingVersion.getDescriptionPlainText(); } - /** - * dataset publication date unpublished datasets will return an empty - * string. - * - * @return String dataset publication date (dd MMM yyyy). - */ - public String getPublicationDate() { - assert (null != workingVersion); - if (DatasetVersion.VersionState.DRAFT == workingVersion.getVersionState()) { - return ""; - } - Date rel_date = workingVersion.getReleaseTime(); - SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd"); - String r = fmt.format(rel_date.getTime()); - return r; - } - /** * dataset authors * @@ -3891,16 +3955,6 @@ public List getDatasetAuthors() { return workingVersion.getDatasetAuthorNames(); } - /** - * dataset subjects - * - * @return array of String containing the subjects for a page - */ - public List getDatasetSubjects() { - assert (null != workingVersion); - return workingVersion.getDatasetSubjects(); - } - /** * publisher (aka - name of root dataverse) * @@ -3949,9 +4003,9 @@ public void downloadRsyncScript() { String lockInfoMessage = "script downloaded"; DatasetLock lock = datasetService.addDatasetLock(dataset.getId(), DatasetLock.Reason.DcmUpload, session.getUser() != null ? 
((AuthenticatedUser)session.getUser()).getId() : null, lockInfoMessage); if (lock != null) { - dataset.setDatasetLock(lock); + dataset.addLock(lock); } else { - logger.warning("Failed to lock the dataset (dataset id="+dataset.getId()+")"); + logger.log(Level.WARNING, "Failed to lock the dataset (dataset id={0})", dataset.getId()); } } @@ -3978,11 +4032,85 @@ public String finishRsyncScriptAction() { * It returns the default summary fields( subject, description, keywords, related publications and notes) * if the custom summary datafields has not been set, otherwise will set the custom fields set by the sysadmins * + * @return the dataset fields to be shown in the dataset summary */ public List getDatasetSummaryFields() { customFields = settingsWrapper.getValueForKey(SettingsServiceBean.Key.CustomDatasetSummaryFields); return DatasetUtil.getDatasetSummaryFields(workingVersion, customFields); } + + public List getConfigureToolsForDataFile(Long fileId) { + return getCachedToolsForDataFile(fileId, ExternalTool.Type.CONFIGURE); + } + + public List getExploreToolsForDataFile(Long fileId) { + return getCachedToolsForDataFile(fileId, ExternalTool.Type.EXPLORE); + } + + public List getCachedToolsForDataFile(Long fileId, ExternalTool.Type type) { + Map> cachedToolsByFileId = new HashMap<>(); + List externalTools = new ArrayList<>(); + switch (type) { + case EXPLORE: + cachedToolsByFileId = exploreToolsByFileId; + externalTools = exploreTools; + break; + case CONFIGURE: + cachedToolsByFileId = configureToolsByFileId; + externalTools = configureTools; + break; + default: + break; + } + List cachedTools = cachedToolsByFileId.get(fileId); + if (cachedTools != null) { //if already queried before and added to list + return cachedTools; + } + DataFile dataFile = datafileService.find(fileId); + cachedTools = ExternalToolServiceBean.findExternalToolsByFile(externalTools, dataFile); + cachedToolsByFileId.put(fileId, cachedTools); //add to map so we don't have to do the lifting again + return cachedTools; + } + + Boolean thisLatestReleasedVersion = null; + public boolean isThisLatestReleasedVersion() { + if (thisLatestReleasedVersion != null) { + return thisLatestReleasedVersion; + } + + if (!workingVersion.isPublished()) { + thisLatestReleasedVersion = false; + return false; + } + + DatasetVersion latestPublishedVersion = null; + Command cmd = new GetLatestPublishedDatasetVersionCommand(dvRequestService.getDataverseRequest(), dataset); + try { + latestPublishedVersion = commandEngine.submit(cmd); + } catch (Exception ex) { + // whatever... + } + + thisLatestReleasedVersion = workingVersion.equals(latestPublishedVersion); + + return thisLatestReleasedVersion; + + } + + public String getJsonLd() { + if (isThisLatestReleasedVersion()) { + ExportService instance = ExportService.getInstance(settingsService); + String jsonLd = instance.getExportAsString(dataset, SchemaDotOrgExporter.NAME); + if (jsonLd != null) { + logger.fine("Returning cached schema.org JSON-LD."); + return jsonLd; + } else { + logger.fine("No cached schema.org JSON-LD available. 
Going to the database."); + return workingVersion.getJsonLd(); + } + } + return ""; + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java index da24256fea7..d10bb7912a3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetServiceBean.java @@ -1,8 +1,3 @@ -/* - * To change this license header, choose License Headers in Project Properties. - * To change this template file, choose Tools | Templates - * and open the template in the editor. - */ package edu.harvard.iq.dataverse; import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; @@ -24,6 +19,7 @@ import java.util.ArrayList; import java.util.Date; import java.util.HashMap; +import java.util.HashSet; import java.util.List; import java.util.Map; import java.util.Set; @@ -36,14 +32,10 @@ import javax.ejb.Stateless; import javax.ejb.TransactionAttribute; import javax.ejb.TransactionAttributeType; -import javax.inject.Inject; import javax.inject.Named; import javax.persistence.EntityManager; -import javax.persistence.NamedStoredProcedureQuery; -import javax.persistence.ParameterMode; import javax.persistence.PersistenceContext; import javax.persistence.Query; -import javax.persistence.StoredProcedureParameter; import javax.persistence.StoredProcedureQuery; import javax.persistence.TypedQuery; import javax.xml.stream.XMLOutputFactory; @@ -506,24 +498,13 @@ public boolean checkDatasetLock(Long datasetId) { return lock.size()>0; } - public String checkDatasetLockInfo(Long datasetId) { - String nativeQuery = "SELECT sl.info FROM DatasetLock sl WHERE sl.dataset_id = " + datasetId + " LIMIT 1;"; - String infoMessage; - try { - infoMessage = (String)em.createNativeQuery(nativeQuery).getSingleResult(); - } catch (Exception ex) { - infoMessage = null; - } - - return infoMessage; - } - @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) - public DatasetLock addDatasetLock(Dataset dataset, DatasetLock lock) { - dataset.setDatasetLock(lock); - em.persist(lock); - return lock; + lock.setDataset(dataset); + dataset.addLock(lock); + em.persist(lock); + em.merge(dataset); + return lock; } @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) /*?*/ @@ -552,29 +533,25 @@ public DatasetLock addDatasetLock(Long datasetId, DatasetLock.Reason reason, Lon return addDatasetLock(dataset, lock); } + /** + * Removes all {@link DatasetLock}s for the dataset whose id is passed and reason + * is {@code aReason}. + * @param datasetId Id of the dataset whose locks will b removed. + * @param aReason The reason of the locks that will be removed. + */ @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) - public void removeDatasetLock(Long datasetId) { + public void removeDatasetLocks(Long datasetId, DatasetLock.Reason aReason) { Dataset dataset = em.find(Dataset.class, datasetId); - //em.refresh(dataset); (?) - DatasetLock lock = dataset.getDatasetLock(); - if (lock != null) { - AuthenticatedUser user = lock.getUser(); - dataset.setDatasetLock(null); - user.getDatasetLocks().remove(lock); - /* - * TODO - ? - * throw an exception if for whatever reason we can't remove the lock? - try { - */ - em.remove(lock); - /* - } catch (TransactionRequiredException te) { - ... - } catch (IllegalArgumentException iae) { - ... 
- } - */ - } + new HashSet<>(dataset.getLocks()).stream() + .filter( l -> l.getReason() == aReason ) + .forEach( lock -> { + dataset.removeLock(lock); + + AuthenticatedUser user = lock.getUser(); + user.getDatasetLocks().remove(lock); + + em.remove(lock); + }); } /* @@ -611,7 +588,7 @@ public String getTitleFromLatestVersion(Long datasetId, boolean includeDraft){ + ";").getSingleResult(); } catch (Exception ex) { - logger.info("exception trying to get title from latest version: " + ex); + logger.log(Level.INFO, "exception trying to get title from latest version: {0}", ex); return ""; } diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java index 9e97e8d475a..ca5791786a7 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java @@ -3,8 +3,10 @@ import edu.harvard.iq.dataverse.util.MarkupChecker; import edu.harvard.iq.dataverse.DatasetFieldType.FieldType; import edu.harvard.iq.dataverse.util.StringUtil; +import edu.harvard.iq.dataverse.util.SystemConfig; import edu.harvard.iq.dataverse.workflows.WorkflowComment; import java.io.Serializable; +import java.math.BigDecimal; import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.ArrayList; @@ -17,6 +19,9 @@ import java.util.Set; import java.util.logging.Level; import java.util.logging.Logger; +import javax.json.Json; +import javax.json.JsonArrayBuilder; +import javax.json.JsonObjectBuilder; import javax.persistence.CascadeType; import javax.persistence.Column; import javax.persistence.Entity; @@ -142,6 +147,9 @@ public enum License { @Transient private String contributorNames; + + @Transient + private String jsonLd; @OneToMany(mappedBy="datasetVersion", cascade={CascadeType.REMOVE, CascadeType.MERGE, CascadeType.PERSIST}) private List datasetVersionUsers; @@ -221,8 +229,7 @@ public void setDatasetFields(List datasetFields) { */ public boolean isInReview() { if (versionState != null && versionState.equals(VersionState.DRAFT)) { - DatasetLock l = getDataset().getDatasetLock(); - return (l != null) && l.getReason()==DatasetLock.Reason.InReview; + return getDataset().isLockedFor(DatasetLock.Reason.InReview); } else { return false; } @@ -418,6 +425,10 @@ public boolean isReleased() { return versionState.equals(VersionState.RELEASED); } + public boolean isPublished() { + return isReleased(); + } + public boolean isDraft() { return versionState.equals(VersionState.DRAFT); } @@ -707,6 +718,42 @@ public List getDatasetAuthors() { return retList; } + public List getTimePeriodsCovered() { + List retList = new ArrayList<>(); + for (DatasetField dsf : this.getDatasetFields()) { + if (dsf.getDatasetFieldType().getName().equals(DatasetFieldConstant.timePeriodCovered)) { + for (DatasetFieldCompoundValue timePeriodValue : dsf.getDatasetFieldCompoundValues()) { + String start = ""; + String end = ""; + for (DatasetField subField : timePeriodValue.getChildDatasetFields()) { + if (subField.getDatasetFieldType().getName().equals(DatasetFieldConstant.timePeriodCoveredStart)) { + if (subField.isEmptyForDisplay()) { + start = null; + } else { + // we want to use "getValue()", as opposed to "getDisplayValue()" here - + // as the latter method prepends the value with the word "Start:"! 
+ start = subField.getValue(); + } + } + if (subField.getDatasetFieldType().getName().equals(DatasetFieldConstant.timePeriodCoveredEnd)) { + if (subField.isEmptyForDisplay()) { + end = null; + } else { + // see the comment above + end = subField.getValue(); + } + } + + } + if (start != null && end != null) { + retList.add(start + "/" + end); + } + } + } + } + return retList; + } + /** * @return List of Strings containing the names of the authors. */ @@ -730,7 +777,55 @@ public List getDatasetSubjects() { } return subjects; } - + + /** + * @return List of Strings containing the version's Topic Classifications + */ + public List getTopicClassifications() { + return getCompoundChildFieldValues(DatasetFieldConstant.topicClassification, DatasetFieldConstant.topicClassValue); + } + + /** + * @return List of Strings containing the version's Keywords + */ + public List getKeywords() { + return getCompoundChildFieldValues(DatasetFieldConstant.keyword, DatasetFieldConstant.keywordValue); + } + + /** + * @return List of Strings containing the version's PublicationCitations + */ + public List getPublicationCitationValues() { + return getCompoundChildFieldValues(DatasetFieldConstant.publication, DatasetFieldConstant.publicationCitation); + } + + /** + * @param parentFieldName compound dataset field A (from DatasetFieldConstant.*) + * @param childFieldName dataset field B, child field of A (from DatasetFieldConstant.*) + * @return List of values of the child field + */ + public List getCompoundChildFieldValues(String parentFieldName, String childFieldName) { + List keywords = new ArrayList<>(); + for (DatasetField dsf : this.getDatasetFields()) { + if (dsf.getDatasetFieldType().getName().equals(parentFieldName)) { + for (DatasetFieldCompoundValue keywordFieldValue : dsf.getDatasetFieldCompoundValues()) { + for (DatasetField subField : keywordFieldValue.getChildDatasetFields()) { + if (subField.getDatasetFieldType().getName().equals(childFieldName)) { + String keyword = subField.getValue(); + // Field values should NOT be empty or, especially, null, + // - in the ideal world. But as we are realizing, they CAN + // be null in real life databases. So, a check, just in case: + if (!StringUtil.isEmpty(keyword)) { + keywords.add(subField.getValue()); + } + } + } + } + } + } + return keywords; + } + public String getDatasetProducersString(){ String retVal = ""; for (DatasetField dsf : this.getDatasetFields()) { @@ -1100,4 +1195,178 @@ public List getWorkflowComments() { return workflowComments; } + /** + * dataset publication date unpublished datasets will return an empty + * string. + * + * @return String dataset publication date in ISO 8601 format (yyyy-MM-dd). + */ + public String getPublicationDateAsString() { + if (DatasetVersion.VersionState.DRAFT == this.getVersionState()) { + return ""; + } + Date rel_date = this.getReleaseTime(); + SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd"); + String r = fmt.format(rel_date.getTime()); + return r; + } + + // TODO: Consider moving this comment into the Exporter code. + // The export subsystem assumes there is only + // one metadata export in a given format per dataset (it uses the current + // released (published) version. This JSON fragment is generated for a + // specific released version - and we can have multiple released versions. + // So something will need to be modified to accommodate this. -- L.A. + + public String getJsonLd() { + // We show published datasets only for "datePublished" field below. 
+ if (!this.isPublished()) { + return ""; + } + + if (jsonLd != null) { + return jsonLd; + } + JsonObjectBuilder job = Json.createObjectBuilder(); + job.add("@context", "http://schema.org"); + job.add("@type", "Dataset"); + job.add("identifier", this.getDataset().getPersistentURL()); + job.add("name", this.getTitle()); + JsonArrayBuilder authors = Json.createArrayBuilder(); + for (DatasetAuthor datasetAuthor : this.getDatasetAuthors()) { + JsonObjectBuilder author = Json.createObjectBuilder(); + String name = datasetAuthor.getName().getValue(); + DatasetField authorAffiliation = datasetAuthor.getAffiliation(); + String affiliation = null; + if (authorAffiliation != null) { + affiliation = datasetAuthor.getAffiliation().getValue(); + } + // We are aware of "givenName" and "familyName" but instead of a person it might be an organization such as "Gallup Organization". + //author.add("@type", "Person"); + author.add("name", name); + // We are aware that the following error is thrown by https://search.google.com/structured-data/testing-tool + // "The property affiliation is not recognized by Google for an object of type Thing." + // Someone at Google has said this is ok. + // This logic could be moved into the `if (authorAffiliation != null)` block above. + if (!StringUtil.isEmpty(affiliation)) { + author.add("affiliation", affiliation); + } + authors.add(author); + } + job.add("author", authors); + /** + * We are aware that there is a "datePublished" field but it means "Date + * of first broadcast/publication." This only makes sense for a 1.0 + * version. + */ + String datePublished = this.getDataset().getPublicationDateFormattedYYYYMMDD(); + if (datePublished != null) { + job.add("datePublished", datePublished); + } + + /** + * "dateModified" is more appropriate for a version: "The date on which + * the CreativeWork was most recently modified or when the item's entry + * was modified within a DataFeed." + */ + job.add("dateModified", this.getPublicationDateAsString()); + job.add("version", this.getVersionNumber().toString()); + job.add("description", this.getDescriptionPlainText()); + /** + * "keywords" - contains subject(s), datasetkeyword(s) and topicclassification(s) + * metadata fields for the version. -- L.A. 
+ * (see #2243 for details/discussion/feedback from Google) + */ + JsonArrayBuilder keywords = Json.createArrayBuilder(); + + for (String subject : this.getDatasetSubjects()) { + keywords.add(subject); + } + + for (String topic : this.getTopicClassifications()) { + keywords.add(topic); + } + + for (String keyword : this.getKeywords()) { + keywords.add(keyword); + } + + job.add("keywords", keywords); + + /** + * citation: + * (multiple) publicationCitation values, if present: + */ + + List publicationCitations = getPublicationCitationValues(); + if (publicationCitations.size() > 0) { + JsonArrayBuilder citation = Json.createArrayBuilder(); + for (String pubCitation : publicationCitations) { + //citationEntry.add("@type", "Dataset"); + //citationEntry.add("text", pubCitation); + citation.add(pubCitation); + } + job.add("citation", citation); + } + + /** + * temporalCoverage: + * (if available) + */ + + List timePeriodsCovered = this.getTimePeriodsCovered(); + if (timePeriodsCovered.size() > 0) { + JsonArrayBuilder temporalCoverage = Json.createArrayBuilder(); + for (String timePeriod : timePeriodsCovered) { + temporalCoverage.add(timePeriod); + } + job.add("temporalCoverage", temporalCoverage); + } + + /** + * spatialCoverage (if available) + * TODO + * (punted, for now - see #2243) + * + */ + + /** + * funder (if available) + * TODO + * (punted, for now - see #2243) + */ + + job.add("schemaVersion", "https://schema.org/version/3.3"); + + TermsOfUseAndAccess terms = this.getTermsOfUseAndAccess(); + if (terms != null) { + JsonObjectBuilder license = Json.createObjectBuilder().add("@type", "Dataset"); + + if (TermsOfUseAndAccess.License.CC0.equals(terms.getLicense())) { + license.add("text", "CC0").add("url", "https://creativecommons.org/publicdomain/zero/1.0/"); + } else { + String termsOfUse = terms.getTermsOfUse(); + // Terms of use can be null if you create the dataset with JSON. 
+ if (termsOfUse != null) { + license.add("text", termsOfUse); + } + } + + job.add("license",license); + } + + job.add("includedInDataCatalog", Json.createObjectBuilder() + .add("@type", "DataCatalog") + .add("name", this.getRootDataverseNameforCitation()) + .add("url", SystemConfig.getDataverseSiteUrlStatic()) + ); + + job.add("provider", Json.createObjectBuilder() + .add("@type", "Organization") + .add("name", "Dataverse") + ); + jsonLd = job.build().toString(); + return jsonLd; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/Dataverse.java b/src/main/java/edu/harvard/iq/dataverse/Dataverse.java index af009ec2063..df656bb61b1 100644 --- a/src/main/java/edu/harvard/iq/dataverse/Dataverse.java +++ b/src/main/java/edu/harvard/iq/dataverse/Dataverse.java @@ -162,7 +162,8 @@ public void setDefaultContributorRole(DataverseRole defaultContributorRole) { private boolean metadataBlockRoot; private boolean facetRoot; - private boolean themeRoot; + // By default, themeRoot should be true, as new dataverses should start with the default theme + private boolean themeRoot = true; private boolean templateRoot; diff --git a/src/main/java/edu/harvard/iq/dataverse/DataverseHeaderFragment.java b/src/main/java/edu/harvard/iq/dataverse/DataverseHeaderFragment.java index d1a2e7ab9dd..dfe6e5e70c9 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataverseHeaderFragment.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataverseHeaderFragment.java @@ -223,11 +223,11 @@ public String logout() { redirectPage = URLDecoder.decode(redirectPage, "UTF-8"); } catch (UnsupportedEncodingException ex) { Logger.getLogger(LoginPage.class.getName()).log(Level.SEVERE, null, ex); - redirectPage = "dataverse.xhtml&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = redirectToRoot(); } if (StringUtils.isEmpty(redirectPage)) { - redirectPage = "dataverse.xhtml&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = redirectToRoot(); } logger.log(Level.INFO, "Sending user to = " + redirectPage); @@ -236,6 +236,10 @@ public String logout() { private Boolean signupAllowed = null; + private String redirectToRoot(){ + return "dataverse.xhtml?alias=" + dataverseService.findRootDataverse().getAlias(); + } + public boolean isSignupAllowed() { if (signupAllowed != null) { return signupAllowed; @@ -245,6 +249,18 @@ public boolean isSignupAllowed() { return signupAllowed; } + public boolean isRootDataverseThemeDisabled(Dataverse dataverse) { + if (dataverse == null) { + return false; + } + if (dataverse.getOwner() == null) { + // We're operating on the root dataverse. 
+ return settingsWrapper.isRootDataverseThemeDisabled(); + } else { + return false; + } + } + public String getSignupUrl(String loginRedirect) { String nonNullDefaultIfKeyNotFound = ""; String signUpUrl = settingsWrapper.getValueForKey(SettingsServiceBean.Key.SignUpUrl, nonNullDefaultIfKeyNotFound); diff --git a/src/main/java/edu/harvard/iq/dataverse/DataversePage.java b/src/main/java/edu/harvard/iq/dataverse/DataversePage.java index c10bc5724a0..07a21c8c557 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataversePage.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataversePage.java @@ -639,7 +639,7 @@ public String save() { JsfHelper.addSuccessMessage(message); editMode = null; - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } catch (CommandException ex) { @@ -732,7 +732,7 @@ public String saveLinkedDataverse() { String msg = "Only authenticated users can link a dataverse."; logger.severe(msg); JsfHelper.addErrorMessage(msg); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } linkingDataverse = dataverseService.find(linkingDataverseId); @@ -745,7 +745,7 @@ public String saveLinkedDataverse() { String msg = "Unable to link " + dataverse.getDisplayName() + " to " + linkingDataverse.getDisplayName() + ". An internal error occurred."; logger.log(Level.SEVERE, "{0} {1}", new Object[]{msg, ex}); JsfHelper.addErrorMessage(msg); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } SavedSearch savedSearchOfChildren = createSavedSearchForChildren(savedSearchCreator); @@ -758,20 +758,20 @@ public String saveLinkedDataverse() { DataverseRequest dataverseRequest = new DataverseRequest(savedSearchCreator, SavedSearchServiceBean.getHttpServletRequest()); savedSearchService.makeLinksForSingleSavedSearch(dataverseRequest, savedSearchOfChildren, debug); JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("dataverse.linked.success", getSuccessMessageArguments())); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } catch (SearchException | CommandException ex) { // error: solr is down, etc. can't link children right now JsfHelper.addErrorMessage(BundleUtil.getStringFromBundle("dataverse.linked.internalerror", getSuccessMessageArguments())); String msg = dataverse.getDisplayName() + " has been successfully linked to " + linkingDataverse.getDisplayName() + " but contents will not appear until an internal error has been fixed."; logger.log(Level.SEVERE, "{0} {1}", new Object[]{msg, ex}); //JsfHelper.addErrorMessage(msg); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } } else { // defer: please wait for the next timer/cron job //JsfHelper.addSuccessMessage(dataverse.getDisplayName() + " has been successfully linked to " + linkingDataverse.getDisplayName() + ". 
Please wait for its contents to appear."); JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("dataverse.linked.success.wait", getSuccessMessageArguments())); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } } @@ -819,7 +819,7 @@ public String saveSavedSearch() { String msg = "Only authenticated users can save a search."; logger.severe(msg); JsfHelper.addErrorMessage(msg); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } SavedSearch savedSearch = new SavedSearch(searchIncludeFragment.getQuery(), linkingDataverse, savedSearchCreator); @@ -843,12 +843,12 @@ public String saveSavedSearch() { arguments.add(linkString); String successMessageString = BundleUtil.getStringFromBundle("dataverse.saved.search.success", arguments); JsfHelper.addSuccessMessage(successMessageString); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } catch (CommandException ex) { String msg = "There was a problem linking this search to yours: " + ex; logger.severe(msg); JsfHelper.addErrorMessage(BundleUtil.getStringFromBundle("dataverse.saved.search.failure") + " " + ex); - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } } @@ -876,7 +876,7 @@ public String releaseDataverse() { } else { JsfHelper.addErrorMessage(BundleUtil.getStringFromBundle("dataverse.publish.not.authorized")); } - return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + return returnRedirect(); } @@ -1016,5 +1016,9 @@ public void validateAlias(FacesContext context, UIComponent toValidate, Object v } } } + + private String returnRedirect(){ + return "/dataverse.xhtml?alias=" + dataverse.getAlias() + "&faces-redirect=true"; + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/DataverseServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DataverseServiceBean.java index 8adc5832040..026af897c6b 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataverseServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataverseServiceBean.java @@ -468,7 +468,40 @@ public Map getAllHarvestedDataverseDescriptions(){ return ret; }*/ + + public String getParentAliasString(SolrSearchResult solrSearchResult){ + Long dvId = solrSearchResult.getEntityId(); + String retVal = ""; + + if (dvId == null) { + return retVal; + } + + String searchResult; + try { + System.out.print("select t0.ALIAS FROM DATAVERSE t0, DVOBJECT t1, DVOBJECT t2 WHERE (t0.ID = t1.ID) AND (t2.OWNER_ID = t1.ID) AND (t2.ID =" + dvId + ")"); + searchResult = (String) em.createNativeQuery("select t0.ALIAS FROM DATAVERSE t0, DVOBJECT t1, DVOBJECT t2 WHERE (t0.ID = t1.ID) AND (t2.OWNER_ID = t1.ID) AND (t2.ID =" + dvId + ")").getSingleResult(); + + } catch (Exception ex) { + System.out.print("catching exception"); + System.out.print("catching exception" + ex.getMessage()); + return retVal; + } + if (searchResult == null) { + System.out.print("searchResult == null"); + return retVal; + } + + if (searchResult != null) { + System.out.print(searchResult); + return searchResult; + } + + return retVal; + } + + public void populateDvSearchCard(SolrSearchResult solrSearchResult) { Long dvId = solrSearchResult.getEntityId(); diff --git a/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java b/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java index 562a60f6e21..8c5aa3b0414 100644 --- 
a/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java @@ -2084,19 +2084,16 @@ private boolean isFileAlreadyUploaded(DataFile dataFile) { public boolean isLocked() { if (dataset != null) { - logger.fine("checking lock status of dataset " + dataset.getId()); + logger.log(Level.FINE, "checking lock status of dataset {0}", dataset.getId()); if (dataset.isLocked()) { // refresh the dataset and version, if the current working // version of the dataset is locked: } Dataset lookedupDataset = datasetService.find(dataset.getId()); - DatasetLock datasetLock = null; - if (lookedupDataset != null) { - datasetLock = lookedupDataset.getDatasetLock(); - if (datasetLock != null) { - logger.fine("locked!"); - return true; - } + + if ( (lookedupDataset!=null) && lookedupDataset.isLocked() ) { + logger.fine("locked!"); + return true; } } return false; @@ -2127,12 +2124,12 @@ public void setFileMetadataSelected(FileMetadata fm){ public void setFileMetadataSelected(FileMetadata fm, String guestbook) { fileMetadataSelected = fm; - logger.fine("set the file for the advanced options popup (" + fileMetadataSelected.getLabel() + ")"); + logger.log(Level.FINE, "set the file for the advanced options popup ({0})", fileMetadataSelected.getLabel()); } public FileMetadata getFileMetadataSelected() { if (fileMetadataSelected != null) { - logger.fine("returning file metadata for the advanced options popup (" + fileMetadataSelected.getLabel() + ")"); + logger.log(Level.FINE, "returning file metadata for the advanced options popup ({0})", fileMetadataSelected.getLabel()); } else { logger.fine("file metadata for the advanced options popup is null."); } @@ -2226,7 +2223,7 @@ public void saveAsDesignatedThumbnail() { } public void deleteDatasetLogoAndUseThisDataFileAsThumbnailInstead() { - logger.fine("For dataset id " + dataset.getId() + " the current thumbnail is from a dataset logo rather than a dataset file, blowing away the logo and using this FileMetadata id instead: " + fileMetadataSelectedForThumbnailPopup); + logger.log(Level.FINE, "For dataset id {0} the current thumbnail is from a dataset logo rather than a dataset file, blowing away the logo and using this FileMetadata id instead: {1}", new Object[]{dataset.getId(), fileMetadataSelectedForThumbnailPopup}); /** * @todo Rather than deleting and merging right away, try to respect how * this page seems to stage actions and giving the user a chance to diff --git a/src/main/java/edu/harvard/iq/dataverse/EjbDataverseEngine.java b/src/main/java/edu/harvard/iq/dataverse/EjbDataverseEngine.java index 411c72ac8b7..e36f7feaec3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/EjbDataverseEngine.java +++ b/src/main/java/edu/harvard/iq/dataverse/EjbDataverseEngine.java @@ -182,7 +182,7 @@ public R submit(Command aCommand) throws CommandException { DataverseRequest dvReq = aCommand.getRequest(); Map affectedDvObjects = aCommand.getAffectedDvObjects(); - logRec.setInfo( describe(affectedDvObjects) ); + logRec.setInfo(aCommand.describe()); for (Map.Entry> pair : requiredMap.entrySet()) { String dvName = pair.getKey(); if (!affectedDvObjects.containsKey(dvName)) { @@ -442,16 +442,5 @@ public DataCaptureModuleServiceBean dataCaptureModule() { return ctxt; } - - - private String describe( Map dvObjMap ) { - StringBuilder sb = new StringBuilder(); - for ( Map.Entry ent : dvObjMap.entrySet() ) { - DvObject value = ent.getValue(); - sb.append(ent.getKey()).append(":"); - sb.append( (value!=null) ? 
value.accept(DvObject.NameIdPrinter) : ""); - sb.append(" "); - } - return sb.toString(); - } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/FileDownloadHelper.java b/src/main/java/edu/harvard/iq/dataverse/FileDownloadHelper.java index 716d34e6310..2e9fdac9511 100644 --- a/src/main/java/edu/harvard/iq/dataverse/FileDownloadHelper.java +++ b/src/main/java/edu/harvard/iq/dataverse/FileDownloadHelper.java @@ -7,15 +7,24 @@ import edu.harvard.iq.dataverse.authorization.Permission; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; -import edu.harvard.iq.dataverse.authorization.users.GuestUser; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.util.BundleUtil; +import static edu.harvard.iq.dataverse.util.JsfHelper.JH; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; +import java.util.logging.Logger; import javax.ejb.EJB; +import javax.faces.application.FacesMessage; +import javax.faces.component.UIInput; +import javax.faces.component.UIOutput; +import javax.faces.context.FacesContext; +import javax.faces.event.AjaxBehaviorEvent; import javax.faces.view.ViewScoped; import javax.inject.Inject; import javax.inject.Named; +import org.primefaces.context.RequestContext; /** * @@ -26,9 +35,13 @@ @ViewScoped @Named public class FileDownloadHelper implements java.io.Serializable { - + + private static final Logger logger = Logger.getLogger(FileDownloadHelper.class.getCanonicalName()); @Inject DataverseSession session; + + @Inject + DataverseRequestServiceBean dvRequestService; @EJB PermissionServiceBean permissionService; @@ -36,16 +49,252 @@ public class FileDownloadHelper implements java.io.Serializable { @EJB FileDownloadServiceBean fileDownloadService; + @EJB + GuestbookResponseServiceBean guestbookResponseService; + @EJB DataFileServiceBean datafileService; + UIInput nameField; + + public UIInput getNameField() { + return nameField; + } + + public void setNameField(UIInput nameField) { + this.nameField = nameField; + } + + public UIInput getEmailField() { + return emailField; + } + + public void setEmailField(UIInput emailField) { + this.emailField = emailField; + } + + public UIInput getInstitutionField() { + return institutionField; + } + + public void setInstitutionField(UIInput institutionField) { + this.institutionField = institutionField; + } + + public UIInput getPositionField() { + return positionField; + } + + public void setPositionField(UIInput positionField) { + this.positionField = positionField; + } + UIInput emailField; + UIInput institutionField; + UIInput positionField; + + + + private final Map fileDownloadPermissionMap = new HashMap<>(); // { FileMetadata.id : Boolean } + public void nameValueChangeListener(AjaxBehaviorEvent e) { + String name= (String) ((UIOutput) e.getSource()).getValue(); + this.guestbookResponse.setName(name); + } + + public void emailValueChangeListener(AjaxBehaviorEvent e) { + String email= (String) ((UIOutput) e.getSource()).getValue(); + this.guestbookResponse.setEmail(email); + } + + public void institutionValueChangeListener(AjaxBehaviorEvent e) { + String institution= (String) ((UIOutput) e.getSource()).getValue(); + this.guestbookResponse.setInstitution(institution); + } + + public void positionValueChangeListener(AjaxBehaviorEvent e) { + String position= (String) ((UIOutput) e.getSource()).getValue(); + this.guestbookResponse.setPosition(position); + } + + public void customQuestionValueChangeListener(AjaxBehaviorEvent e) { + 
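// Note: this listener only reads the question id (the source component id) and the submitted value off the Ajax event; the responses themselves are carried in guestbookResponse.getCustomQuestionResponses(), which validateGuestbookResponse() below iterates over.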
String questionNo = (String) ((UIOutput) e.getSource()).getId(); + String position= (String) ((UIOutput) e.getSource()).getValue(); + } + public FileDownloadHelper() { this.filesForRequestAccess = new ArrayList<>(); } + private boolean testResponseLength(String value) { + return !(value != null && value.length() > 255); + } + + private boolean validateGuestbookResponse(GuestbookResponse guestbookResponse){ + Dataset dataset = guestbookResponse.getDataset(); + boolean valid = true; + if (dataset.getGuestbook() != null) { + if (dataset.getGuestbook().isNameRequired()) { + boolean nameValid = (guestbookResponse.getName() != null && !guestbookResponse.getName().isEmpty()); + if (!nameValid) { + nameField.setValid(false); + FacesContext.getCurrentInstance().addMessage(nameField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("requiredField"), null)); + } + valid &= nameValid; + } + valid &= testResponseLength(guestbookResponse.getName()); + if (! testResponseLength(guestbookResponse.getName())){ + nameField.setValid(false); + FacesContext.getCurrentInstance().addMessage(nameField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("dataset.guestbookResponse.guestbook.responseTooLong"), null)); + } + if (dataset.getGuestbook().isEmailRequired()) { + boolean emailValid = (guestbookResponse.getEmail() != null && !guestbookResponse.getEmail().isEmpty()); + if (!emailValid) { + emailField.setValid(false); + FacesContext.getCurrentInstance().addMessage(emailField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("requiredField"), null)); + } + valid &= emailValid; + } + valid &= testResponseLength(guestbookResponse.getEmail()); + if (! testResponseLength(guestbookResponse.getEmail())){ + emailField.setValid(false); + FacesContext.getCurrentInstance().addMessage(emailField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("dataset.guestbookResponse.guestbook.responseTooLong"), null)); + } + if (dataset.getGuestbook().isInstitutionRequired()) { + boolean institutionValid = (guestbookResponse.getInstitution()!= null && !guestbookResponse.getInstitution().isEmpty()); + if (!institutionValid) { + institutionField.setValid(false); + FacesContext.getCurrentInstance().addMessage(institutionField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("requiredField"), null)); + } + valid &= institutionValid; + } + valid &= testResponseLength(guestbookResponse.getInstitution()); + if (! testResponseLength(guestbookResponse.getInstitution())){ + institutionField.setValid(false); + FacesContext.getCurrentInstance().addMessage(institutionField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("dataset.guestbookResponse.guestbook.responseTooLong"), null)); + } + if (dataset.getGuestbook().isPositionRequired()) { + boolean positionValid = (guestbookResponse.getPosition()!= null && !guestbookResponse.getPosition().isEmpty()); + if (!positionValid) { + positionField.setValid(false); + FacesContext.getCurrentInstance().addMessage(positionField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("requiredField"), null)); + } + valid &= positionValid; + } + valid &= testResponseLength(guestbookResponse.getPosition()); + if (! 
testResponseLength(guestbookResponse.getPosition())){ + positionField.setValid(false); + FacesContext.getCurrentInstance().addMessage(positionField.getClientId(), + new FacesMessage(FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("dataset.guestbookResponse.guestbook.responseTooLong"), null)); + } + } + + if (dataset.getGuestbook() != null && !dataset.getGuestbook().getCustomQuestions().isEmpty()) { + for (CustomQuestion cq : dataset.getGuestbook().getCustomQuestions()) { + if (cq.isRequired()) { + for (CustomQuestionResponse cqr : guestbookResponse.getCustomQuestionResponses()) { + if (cqr.getCustomQuestion().equals(cq)) { + valid &= (cqr.getResponse() != null && !cqr.getResponse().isEmpty()); + if (cqr.getResponse() == null || cqr.getResponse().isEmpty()){ + cqr.setValidationMessage(BundleUtil.getStringFromBundle("requiredField")); + } else{ + cqr.setValidationMessage(""); + } + } + } + } + } + } + + return valid; + + } + + public void writeGuestbookAndStartDownload(GuestbookResponse guestbookResponse) { + RequestContext requestContext = RequestContext.getCurrentInstance(); + boolean valid = validateGuestbookResponse(guestbookResponse); + + if (!valid) { + JH.addMessage(FacesMessage.SEVERITY_ERROR, JH.localize("dataset.message.validationError")); + } else { + requestContext.execute("PF('downloadPopup').hide()"); + guestbookResponse.setDownloadtype("Download"); + fileDownloadService.writeGuestbookAndStartDownload(guestbookResponse); + } + + } + + public void writeGuestbookAndOpenSubset(GuestbookResponse guestbookResponse) { + RequestContext requestContext = RequestContext.getCurrentInstance(); + boolean valid = validateGuestbookResponse(guestbookResponse); + + if (!valid) { + + } else { + requestContext.execute("PF('downloadPopup').hide()"); + requestContext.execute("PF('downloadDataSubsetPopup').show()"); + guestbookResponse.setDownloadtype("Subset"); + fileDownloadService.writeGuestbookResponseRecord(guestbookResponse); + } + + } + + /** + * This method is only invoked from a popup. A popup appears when the + * user might have to accept terms of use, fill in a guestbook, etc. + */ + public void writeGuestbookAndLaunchExploreTool(GuestbookResponse guestbookResponse, FileMetadata fmd, ExternalTool externalTool) { + + /** + * We need externalTool to be non-null when calling "explore" below (so + * that we can instantiate an ExternalToolHandler) so we retrieve + * externalTool from a transient variable in guestbookResponse if + * externalTool is null. The current observation is that externalTool + * is null from the dataset page and non-null from the file page. See + * file-download-button-fragment.xhtml where the popup is launched (as + * needed) and file-download-popup-fragment.xhtml for the popup itself. + * + * TODO: If we could figure out a way for externalTool to always be + * non-null, we could remove this if statement and the transient + * "externalTool" variable on guestbookResponse. 
+ */ + if (externalTool == null) { + externalTool = guestbookResponse.getExternalTool(); + } + + RequestContext requestContext = RequestContext.getCurrentInstance(); + boolean valid = validateGuestbookResponse(guestbookResponse); + + if (!valid) { + return; + } + fileDownloadService.explore(guestbookResponse, fmd, externalTool); + requestContext.execute("PF('downloadPopup').hide()"); + } + + public String startWorldMapDownloadLink(GuestbookResponse guestbookResponse, FileMetadata fmd){ + + RequestContext requestContext = RequestContext.getCurrentInstance(); + boolean valid = validateGuestbookResponse(guestbookResponse); + + if (!valid) { + return ""; + } + guestbookResponse.setDownloadtype("WorldMap"); + String retVal = fileDownloadService.startWorldMapDownloadLink(guestbookResponse, fmd); + requestContext.execute("PF('downloadPopup').hide()"); + return retVal; + } + + private List filesForRequestAccess; public List getFilesForRequestAccess() { @@ -104,26 +353,17 @@ public boolean canDownloadFile(FileMetadata fileMetadata){ return false; } - // -------------------------------------------------------------------- - // Grab the fileMetadata.id and restriction flag - // -------------------------------------------------------------------- Long fid = fileMetadata.getId(); //logger.info("calling candownloadfile on filemetadata "+fid); + // Note that `isRestricted` at the FileMetadata level is for expressing intent by version. Enforcement is done with `isRestricted` at the DataFile level. boolean isRestrictedFile = fileMetadata.isRestricted(); - // -------------------------------------------------------------------- // Has this file been checked? Look at the DatasetPage hash - // -------------------------------------------------------------------- if (this.fileDownloadPermissionMap.containsKey(fid)){ // Yes, return previous answer //logger.info("using cached result for candownloadfile on filemetadata "+fid); return this.fileDownloadPermissionMap.get(fid); } - //---------------------------------------------------------------------- - //(0) Before we do any testing - if version is deaccessioned and user - // does not have edit dataset permission then may download - //---------------------------------------------------------------------- - if (fileMetadata.getDatasetVersion().isDeaccessioned()) { if (this.doesSessionUserHavePermission(Permission.EditDataset, fileMetadata)) { // Yes, save answer and return true @@ -135,66 +375,20 @@ public boolean canDownloadFile(FileMetadata fileMetadata){ } } - // -------------------------------------------------------------------- - // (1) Is the file Unrestricted ? - // -------------------------------------------------------------------- if (!isRestrictedFile){ // Yes, save answer and return true this.fileDownloadPermissionMap.put(fid, true); return true; } - // -------------------------------------------------------------------- - // Conditions (2) through (4) are for Restricted files - // -------------------------------------------------------------------- - - // -------------------------------------------------------------------- - // (2) In Dataverse 4.3 and earlier we required that users be authenticated - // to download files, but in developing the Private URL feature, we have - // added a new subclass of "User" called "PrivateUrlUser" that returns false - // for isAuthenticated but that should be able to download restricted files - // when given the Member role (which includes the DownloadFile permission). 
- // This is consistent with how Builtin and Shib users (both are - // AuthenticatedUsers) can download restricted files when they are granted - // the Member role. For this reason condition 2 has been changed. Previously, - // we required isSessionUserAuthenticated to return true. Now we require - // that the User is not an instance of GuestUser, which is similar in - // spirit to the previous check. - // -------------------------------------------------------------------- - - if (session.getUser() instanceof GuestUser){ - this.fileDownloadPermissionMap.put(fid, false); - return false; - } - - - // -------------------------------------------------------------------- - // (3) Does the User have DownloadFile Permission at the **Dataset** level - // -------------------------------------------------------------------- - - - if (this.doesSessionUserHavePermission(Permission.DownloadFile, fileMetadata)){ - // Yes, save answer and return true + // See if the DataverseRequest, which contains IP Groups, has permission to download the file. + if (permissionService.requestOn(dvRequestService.getDataverseRequest(), fileMetadata.getDataFile()).has(Permission.DownloadFile)) { + logger.fine("The DataverseRequest (User plus IP address) has access to download the file."); this.fileDownloadPermissionMap.put(fid, true); return true; } - - // -------------------------------------------------------------------- - // (4) Does the user has DownloadFile permission on the DataFile - // -------------------------------------------------------------------- - /* - if (this.permissionService.on(fileMetadata.getDataFile()).has(Permission.DownloadFile)){ - this.fileDownloadPermissionMap.put(fid, true); - return true; - } - */ - - // -------------------------------------------------------------------- - // (6) No download.... - // -------------------------------------------------------------------- this.fileDownloadPermissionMap.put(fid, false); - return false; } @@ -264,7 +458,20 @@ private void processRequestAccess(DataFile file, Boolean sendNotification) { } } } + + private GuestbookResponse guestbookResponse; + + public GuestbookResponse getGuestbookResponse() { + return guestbookResponse; + } + + public void setGuestbookResponse(GuestbookResponse guestbookResponse) { + this.guestbookResponse = guestbookResponse; + } + public GuestbookResponseServiceBean getGuestbookResponseService(){ + return this.guestbookResponseService; + } //todo: potential cleanup - are these methods needed? 
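The reworked canDownloadFile() above now caches one answer per FileMetadata id in fileDownloadPermissionMap and delegates the actual decision to permissionService.requestOn(...).has(Permission.DownloadFile), so IP-group-based grants are honored alongside per-user role assignments. A minimal, self-contained sketch of that memoization pattern follows; the names here (DownloadPermissionCache, the Predicate) are illustrative stand-ins for the Dataverse beans, not part of its API.

import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Sketch only: memoize an expensive per-file permission lookup for the
// lifetime of a view-scoped bean, the way fileDownloadPermissionMap is used.
class DownloadPermissionCache {

    private final Map<Long, Boolean> cache = new HashMap<>(); // { FileMetadata.id : Boolean }
    private final Predicate<Long> expensiveCheck;             // stands in for the PermissionServiceBean call

    DownloadPermissionCache(Predicate<Long> expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    boolean canDownload(Long fileMetadataId) {
        // Consult the permission system only the first time a given id is seen;
        // later calls (e.g. repeated "rendered=..." evaluations) hit the map.
        return cache.computeIfAbsent(fileMetadataId, expensiveCheck::test);
    }
}

In the bean above, the predicate corresponds to the requestOn(dvRequestService.getDataverseRequest(), fileMetadata.getDataFile()).has(Permission.DownloadFile) check, and the cache lives only as long as the @ViewScoped helper does.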
diff --git a/src/main/java/edu/harvard/iq/dataverse/FileDownloadServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/FileDownloadServiceBean.java index 0dd4bd3b4be..29deb5fe9b9 100644 --- a/src/main/java/edu/harvard/iq/dataverse/FileDownloadServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/FileDownloadServiceBean.java @@ -1,14 +1,16 @@ package edu.harvard.iq.dataverse; +import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; import edu.harvard.iq.dataverse.authorization.Permission; +import edu.harvard.iq.dataverse.authorization.users.ApiToken; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; -import edu.harvard.iq.dataverse.dataaccess.SwiftAccessIO; -import edu.harvard.iq.dataverse.datasetutility.TwoRavensHelper; +import edu.harvard.iq.dataverse.authorization.users.User; import edu.harvard.iq.dataverse.datasetutility.WorldMapPermissionHelper; -import edu.harvard.iq.dataverse.engine.command.Command; import edu.harvard.iq.dataverse.engine.command.exception.CommandException; import edu.harvard.iq.dataverse.engine.command.impl.CreateGuestbookResponseCommand; import edu.harvard.iq.dataverse.engine.command.impl.RequestAccessCommand; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.externaltools.ExternalToolHandler; import edu.harvard.iq.dataverse.util.FileUtil; import java.io.IOException; import java.sql.Timestamp; @@ -19,7 +21,6 @@ import java.util.logging.Logger; import javax.ejb.EJB; import javax.ejb.Stateless; -import javax.faces.context.ExternalContext; import javax.faces.context.FacesContext; import javax.inject.Inject; import javax.inject.Named; @@ -27,7 +28,6 @@ import javax.persistence.PersistenceContext; import javax.servlet.ServletOutputStream; import javax.servlet.http.HttpServletResponse; -import org.primefaces.context.RequestContext; /** * @@ -56,6 +56,8 @@ public class FileDownloadServiceBean implements java.io.Serializable { DataverseServiceBean dataverseService; @EJB UserNotificationServiceBean userNotificationService; + @EJB + AuthenticationServiceBean authService; @Inject DataverseSession session; @@ -66,14 +68,14 @@ public class FileDownloadServiceBean implements java.io.Serializable { @Inject DataverseRequestServiceBean dvRequestService; - @Inject TwoRavensHelper twoRavensHelper; @Inject WorldMapPermissionHelper worldMapPermissionHelper; @Inject FileDownloadHelper fileDownloadHelper; - private static final Logger logger = Logger.getLogger(FileDownloadServiceBean.class.getCanonicalName()); + private static final Logger logger = Logger.getLogger(FileDownloadServiceBean.class.getCanonicalName()); public void writeGuestbookAndStartDownload(GuestbookResponse guestbookResponse){ + if (guestbookResponse != null && guestbookResponse.getDataFile() != null ){ writeGuestbookResponseRecord(guestbookResponse); callDownloadServlet(guestbookResponse.getFileFormat(), guestbookResponse.getDataFile().getId(), guestbookResponse.isWriteResponse()); @@ -145,38 +147,49 @@ public void startFileDownload(GuestbookResponse guestbookResponse, FileMetadata callDownloadServlet(format, fileMetadata.getDataFile().getId(), recordsWritten); logger.fine("issued file download redirect for filemetadata "+fileMetadata.getId()+", datafile "+fileMetadata.getDataFile().getId()); } - - - public String startExploreDownloadLink(GuestbookResponse guestbookResponse, FileMetadata fmd){ - if (guestbookResponse != null && guestbookResponse.isWriteResponse() - && (( fmd != null && fmd.getDataFile() != null) || 
guestbookResponse.getDataFile() != null)){ - if(guestbookResponse.getDataFile() == null && fmd != null){ - guestbookResponse.setDataFile(fmd.getDataFile()); - } - if (fmd == null || !fmd.getDatasetVersion().isDraft()){ - writeGuestbookResponseRecord(guestbookResponse); - } + /** + * Launch an "explore" tool which is a type of ExternalTool such as + * TwoRavens or Data Explorer. This method may be invoked directly from the + * xhtml if no popup is required (no terms of use, no guestbook, etc.). + */ + public void explore(GuestbookResponse guestbookResponse, FileMetadata fmd, ExternalTool externalTool) { + ApiToken apiToken = null; + User user = session.getUser(); + if (user instanceof AuthenticatedUser) { + AuthenticatedUser authenticatedUser = (AuthenticatedUser) user; + apiToken = authService.findApiTokenByUser(authenticatedUser); } - - Long datafileId; - - if (fmd == null && guestbookResponse != null && guestbookResponse.getDataFile() != null){ - datafileId = guestbookResponse.getDataFile().getId(); + DataFile dataFile = null; + if (fmd != null) { + dataFile = fmd.getDataFile(); } else { - datafileId = fmd.getDataFile().getId(); + if (guestbookResponse != null) { + dataFile = guestbookResponse.getDataFile(); + } } - String retVal = twoRavensHelper.getDataExploreURLComplete(datafileId); - + ExternalToolHandler externalToolHandler = new ExternalToolHandler(externalTool, dataFile, apiToken); + // Back when we only had TwoRavens, the downloadType was always "Explore". Now we persist the name of the tool (i.e. "TwoRavens", "Data Explorer", etc.) + guestbookResponse.setDownloadtype(externalTool.getDisplayName()); + String toolUrl = externalToolHandler.getToolUrlWithQueryParams(); + logger.fine("Exploring with " + toolUrl); try { - FacesContext.getCurrentInstance().getExternalContext().redirect(retVal); - return retVal; + FacesContext.getCurrentInstance().getExternalContext().redirect(toolUrl); } catch (IOException ex) { - logger.info("Failed to issue a redirect to file download url."); + logger.info("Problem exploring with " + toolUrl + " - " + ex); + } + // This is the old logic from TwoRavens, null checks and all. + if (guestbookResponse != null && guestbookResponse.isWriteResponse() + && ((fmd != null && fmd.getDataFile() != null) || guestbookResponse.getDataFile() != null)) { + if (guestbookResponse.getDataFile() == null && fmd != null) { + guestbookResponse.setDataFile(fmd.getDataFile()); + } + if (fmd == null || !fmd.getDatasetVersion().isDraft()) { + writeGuestbookResponseRecord(guestbookResponse); + } } - return retVal; } - + public String startWorldMapDownloadLink(GuestbookResponse guestbookResponse, FileMetadata fmd){ if (guestbookResponse != null && guestbookResponse.isWriteResponse() && ((fmd != null && fmd.getDataFile() != null) || guestbookResponse.getDataFile() != null)){ diff --git a/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java b/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java index 6d74ca456d4..62f884e01da 100644 --- a/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java +++ b/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java @@ -64,6 +64,13 @@ public class FileMetadata implements Serializable { @Column(columnDefinition = "TEXT") private String description = ""; + /** + * At the FileMetadata level, "restricted" is a historical indication of the + * data owner's intent for the file by version. Permissions are actually + * enforced based on the "restricted" boolean at the *DataFile* level. 
On + * publish, the latest intent is copied from the FileMetadata level to the + * DataFile level. + */ @Expose private boolean restricted; diff --git a/src/main/java/edu/harvard/iq/dataverse/FilePage.java b/src/main/java/edu/harvard/iq/dataverse/FilePage.java index 43b5ac99396..2578d25f503 100644 --- a/src/main/java/edu/harvard/iq/dataverse/FilePage.java +++ b/src/main/java/edu/harvard/iq/dataverse/FilePage.java @@ -10,15 +10,18 @@ import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; import edu.harvard.iq.dataverse.authorization.Permission; import edu.harvard.iq.dataverse.dataaccess.StorageIO; -import edu.harvard.iq.dataverse.datasetutility.TwoRavensHelper; import edu.harvard.iq.dataverse.datasetutility.WorldMapPermissionHelper; import edu.harvard.iq.dataverse.engine.command.Command; import edu.harvard.iq.dataverse.engine.command.exception.CommandException; +import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; +import edu.harvard.iq.dataverse.engine.command.impl.CreateDatasetCommand; import edu.harvard.iq.dataverse.engine.command.impl.RestrictFileCommand; import edu.harvard.iq.dataverse.engine.command.impl.UpdateDatasetCommand; import edu.harvard.iq.dataverse.export.ExportException; import edu.harvard.iq.dataverse.export.ExportService; import edu.harvard.iq.dataverse.export.spi.Exporter; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.externaltools.ExternalToolServiceBean; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.FileUtil; import edu.harvard.iq.dataverse.util.JsfHelper; @@ -27,9 +30,7 @@ import java.io.IOException; import java.util.ArrayList; import java.util.List; -import java.util.ResourceBundle; import java.util.Set; -import java.util.logging.Level; import java.util.logging.Logger; import javax.ejb.EJB; import javax.ejb.EJBException; @@ -62,7 +63,8 @@ public class FilePage implements java.io.Serializable { private Dataset dataset; private List datasetVersionsForTab; private List fileMetadatasForTab; - + private List configureTools; + private List exploreTools; @EJB DataFileServiceBean datafileService; @@ -89,6 +91,8 @@ public class FilePage implements java.io.Serializable { DataverseSession session; @EJB EjbDataverseEngine commandEngine; + @EJB + ExternalToolServiceBean externalToolService; @Inject DataverseRequestServiceBean dvRequestService; @@ -96,8 +100,6 @@ public class FilePage implements java.io.Serializable { PermissionsWrapper permissionsWrapper; @Inject FileDownloadHelper fileDownloadHelper; - @Inject - TwoRavensHelper twoRavensHelper; @Inject WorldMapPermissionHelper worldMapPermissionHelper; public WorldMapPermissionHelper getWorldMapPermissionHelper() { @@ -155,6 +157,13 @@ public String init() { this.guestbookResponse = this.guestbookResponseService.initGuestbookResponseForFragment(fileMetadata, session); + // this.getFileDownloadHelper().setGuestbookResponse(guestbookResponse); + + if (file.isTabularData()) { + configureTools = externalToolService.findByType(ExternalTool.Type.CONFIGURE); + exploreTools = externalToolService.findByType(ExternalTool.Type.EXPLORE); + } + } else { return permissionsWrapper.notFound(); @@ -502,17 +511,28 @@ public String save() { return ""; } + private Boolean thumbnailAvailable = null; + public boolean isThumbnailAvailable(FileMetadata fileMetadata) { // new and optimized logic: // - check download permission here (should be cached - so it's free!) 
// - only then ask the file service if the thumbnail is available/exists. - // the service itself no longer checks download permissions. + // the service itself no longer checks download permissions. + // (Also, cache the result the first time the check is performed... + // remember - methods referenced in "rendered=..." attributes are + // called *multiple* times as the page is loading!) + if (thumbnailAvailable != null) { + return thumbnailAvailable; + } + if (!fileDownloadHelper.canDownloadFile(fileMetadata)) { - return false; + thumbnailAvailable = false; + } else { + thumbnailAvailable = datafileService.isThumbnailAvailable(fileMetadata.getDataFile()); } - - return datafileService.isThumbnailAvailable(fileMetadata.getDataFile()); + + return thumbnailAvailable; } private String returnToDatasetOnly(){ @@ -638,16 +658,20 @@ public boolean isDraftReplacementFile(){ Since it must must work when you are on prior versions of the dataset it must accrue all replacement files that may have been created */ - Dataset datasetToTest = fileMetadata.getDataFile().getOwner(); - DataFile dataFileToTest = fileMetadata.getDataFile(); + if(null == dataset) { + dataset = fileMetadata.getDataFile().getOwner(); + } - DatasetVersion currentVersion = datasetToTest.getLatestVersion(); + //MAD: Can we use the file variable already existing? + DataFile dataFileToTest = fileMetadata.getDataFile(); + + DatasetVersion currentVersion = dataset.getLatestVersion(); if (!currentVersion.isDraft()){ return false; } - if (datasetToTest.getReleasedVersion() == null){ + if (dataset.getReleasedVersion() == null){ return false; } @@ -668,7 +692,7 @@ public boolean isDraftReplacementFile(){ DataFile current = dataFiles.get(numFiles - 1 ); - DatasetVersion publishedVersion = datasetToTest.getReleasedVersion(); + DatasetVersion publishedVersion = dataset.getReleasedVersion(); if( datafileService.findFileMetadataByDatasetVersionIdAndDataFileId(publishedVersion.getId(), current.getId()) == null){ return true; @@ -692,10 +716,48 @@ public boolean isReplacementFile(){ public boolean isPubliclyDownloadable() { return FileUtil.isPubliclyDownloadable(fileMetadata); } + + private Boolean lockedFromEditsVar; + private Boolean lockedFromDownloadVar; + + /** + * Authors are not allowed to edit but curators are allowed - when Dataset is inReview + * For all other locks edit should be locked for all editors. 
+ */ + public boolean isLockedFromEdits() { + if(null == dataset) { + dataset = fileMetadata.getDataFile().getOwner(); + } + + if(null == lockedFromEditsVar) { + try { + permissionService.checkEditDatasetLock(dataset, dvRequestService.getDataverseRequest(), new UpdateDatasetCommand(dataset, dvRequestService.getDataverseRequest())); + lockedFromEditsVar = false; + } catch (IllegalCommandException ex) { + lockedFromEditsVar = true; + } + } + return lockedFromEditsVar; + } + + public boolean isLockedFromDownload(){ + if(null == dataset) { + dataset = fileMetadata.getDataFile().getOwner(); + } + if (null == lockedFromDownloadVar) { + try { + permissionService.checkDownloadFileLock(dataset, dvRequestService.getDataverseRequest(), new CreateDatasetCommand(dataset, dvRequestService.getDataverseRequest())); + lockedFromDownloadVar = false; + } catch (IllegalCommandException ex) { + lockedFromDownloadVar = true; + } + } + return lockedFromDownloadVar; + } public String getPublicDownloadUrl() { - try { - StorageIO storageIO = getFile().getStorageIO(); + try { + StorageIO storageIO = getFile().getStorageIO(); if (storageIO instanceof SwiftAccessIO) { String fileDownloadUrl = null; try { @@ -723,4 +785,12 @@ public String getPublicDownloadUrl() { return FileUtil.getPublicDownloadUrl(systemConfig.getDataverseSiteUrl(), fileId); } + public List getConfigureTools() { + return configureTools; + } + + public List getExploreTools() { + return exploreTools; + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/GlobalId.java b/src/main/java/edu/harvard/iq/dataverse/GlobalId.java index 3761245f583..519c70a5628 100644 --- a/src/main/java/edu/harvard/iq/dataverse/GlobalId.java +++ b/src/main/java/edu/harvard/iq/dataverse/GlobalId.java @@ -22,8 +22,8 @@ public class GlobalId implements java.io.Serializable { public static final String DOI_PROTOCOL = "doi"; public static final String HDL_PROTOCOL = "hdl"; - public static final String HDL_RESOLVER_URL = "http://hdl.handle.net/"; - public static final String DOI_RESOLVER_URL = "http://dx.doi.org/"; + public static final String HDL_RESOLVER_URL = "https://hdl.handle.net/"; + public static final String DOI_RESOLVER_URL = "https://doi.org/"; @EJB SettingsServiceBean settingsService; diff --git a/src/main/java/edu/harvard/iq/dataverse/GlobalId.java.sekmiller b/src/main/java/edu/harvard/iq/dataverse/GlobalId.java.sekmiller deleted file mode 100644 index 042685c5d49..00000000000 --- a/src/main/java/edu/harvard/iq/dataverse/GlobalId.java.sekmiller +++ /dev/null @@ -1,157 +0,0 @@ -/* - * To change this license header, choose License Headers in Project Properties. - * To change this template file, choose Tools | Templates - * and open the template in the editor. 
- */ - -package edu.harvard.iq.dataverse; - -import edu.harvard.iq.dataverse.settings.SettingsServiceBean; -import java.net.MalformedURLException; -import java.util.logging.Level; -import java.util.logging.Logger; -import java.net.URL; -import javax.ejb.EJB; - -/** - * - * @author skraffmiller - */ -public class GlobalId implements java.io.Serializable { - - @EJB - SettingsServiceBean settingsService; - - public GlobalId(String identifier) { - - // set the protocol, authority, and identifier via parsePersistentId - if (!this.parsePersistentId(identifier)){ - throw new IllegalArgumentException("Failed to parse identifier: " + identifier); - } - } - - public GlobalId(String protocol, String authority, String identifier) { - this.protocol = protocol; - this.authority = authority; - this.identifier = identifier; - } - - public GlobalId(Dataset dataset){ - this.authority = dataset.getAuthority(); - this.protocol = dataset.getProtocol(); - this.identifier = dataset.getIdentifier(); - } - - private String protocol; - private String authority; - private String identifier; - - public String getProtocol() { - return protocol; - } - - public void setProtocol(String protocol) { - this.protocol = protocol; - } - - public String getAuthority() { - return authority; - } - - public void setAuthority(String authority) { - this.authority = authority; - } - - public String getIdentifier() { - return identifier; - } - - public void setIdentifier(String identifier) { - this.identifier = identifier; - } - - public String toString() { - return protocol + ":" + authority + "/" + identifier; - } - - public URL toURL() { - URL url = null; - try { - if (protocol.equals("doi")){ - url = new URL("http://dx.doi.org/" + authority + "/" + identifier); - } else { - url = new URL("http://hdl.handle.net/" + authority + "/" + identifier); - } - } catch (MalformedURLException ex) { - Logger.getLogger(GlobalId.class.getName()).log(Level.SEVERE, null, ex); - } - return url; - } - - - /** - * Parse a Persistent Id and set the protocol, authority, and identifier - * - * Example 1: doi:10.5072/FK2/BYM3IW - * protocol: doi - * authority: 10.5072/FK2 - * identifier: BYM3IW - * - * Example 2: hdl:1902.1/111012 - * protocol: hdl - * authority: 1902.1 - * identifier: 111012 - * - * @param persistentId - * - */ - private boolean parsePersistentId(String persistentId){ - - if (persistentId==null){ - return false; - } - - String doiSeparator = "/";//settingsService.getValueForKey(SettingsServiceBean.Key.DoiSeparator, "/"); - - // Looking for this split - // doi:10.5072/FK2/BYM3IW => (doi) (10.5072/FK2/BYM3IW) - // - // or this one: (hdl) (1902.1/xxxxx) - // - String[] items = persistentId.split(":"); - if (items.length != 2){ - return false; - } - String protocolPiece = items[0].toLowerCase(); - - String[] pieces = items[1].split(doiSeparator); - - // ----------------------------- - // Is this a handle? - // ----------------------------- - if ( pieces.length == 2 && protocolPiece.equals("hdl")){ - // example: hdl:1902.1/111012 - - this.protocol = protocolPiece; // hdl - this.authority = pieces[0]; // 1902.1 - this.identifier = pieces[1]; // 111012 - return true; - - }else if (pieces.length == 3 && protocolPiece.equals("doi")){ - // ----------------------------- - // Is this a DOI? 
- // ----------------------------- - // example: doi:10.5072/FK2/BYM3IW - - this.protocol = protocolPiece; // doi - this.authority = pieces[0] + doiSeparator + pieces[1]; // "10.5072/FK2" - this.identifier = pieces[2]; // "BYM3IW" - return true; - } - - return false; - } - - - -} diff --git a/src/main/java/edu/harvard/iq/dataverse/Guestbook.java b/src/main/java/edu/harvard/iq/dataverse/Guestbook.java index fcfd35dcd4d..61e621b7090 100644 --- a/src/main/java/edu/harvard/iq/dataverse/Guestbook.java +++ b/src/main/java/edu/harvard/iq/dataverse/Guestbook.java @@ -14,6 +14,7 @@ import javax.persistence.JoinColumn; import javax.persistence.OneToMany; import java.util.List; +import java.util.Objects; import javax.persistence.Column; import javax.persistence.ManyToOne; import javax.persistence.OrderBy; @@ -297,5 +298,14 @@ public void setResponseCountDataverse(Long responseCountDataverse) { this.responseCountDataverse = responseCountDataverse; } + @Override + public boolean equals(Object object) { + if (!(object instanceof Guestbook)) { + return false; + } + Guestbook other = (Guestbook) object; + return Objects.equals(getId(), other.getId()); + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/GuestbookResponse.java b/src/main/java/edu/harvard/iq/dataverse/GuestbookResponse.java index a3790fd32ce..fafb1fc5b9b 100644 --- a/src/main/java/edu/harvard/iq/dataverse/GuestbookResponse.java +++ b/src/main/java/edu/harvard/iq/dataverse/GuestbookResponse.java @@ -6,6 +6,7 @@ package edu.harvard.iq.dataverse; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; import java.io.Serializable; import java.text.SimpleDateFormat; import java.util.ArrayList; @@ -58,6 +59,16 @@ public class GuestbookResponse implements Serializable { private String email; private String institution; private String position; + /** + * Possible values for downloadType include "Download", "Subset", + * "WorldMap", or the displayName of an ExternalTool. + * + * TODO: Types like "Download" and "Subset" and probably "WorldMap" should + * be defined once as constants (likely an enum) rather than having these + * strings duplicated in various places when setDownloadtype() is called. + * (Some day it would be nice to convert WorldMap into an ExternalTool but + * it's not worth the effort at this time.) + */ private String downloadtype; private String sessionId; @@ -81,6 +92,14 @@ public class GuestbookResponse implements Serializable { @Transient private boolean writeResponse = true; + /** + * This transient variable is a place to temporarily retrieve the + * ExternalTool object from the popup when the popup is required on the + * dataset page. TODO: Some day, investigate if it can be removed. 
+ */ + @Transient + private ExternalTool externalTool; + public boolean isWriteResponse() { return writeResponse; } @@ -106,6 +125,14 @@ public void setFileFormat(String downloadFormat) { this.fileFormat = downloadFormat; } + public ExternalTool getExternalTool() { + return externalTool; + } + + public void setExternalTool(ExternalTool externalTool) { + this.externalTool = externalTool; + } + public GuestbookResponse(){ } diff --git a/src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java index 50fe6f9632f..d29f61a7b59 100644 --- a/src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java @@ -7,7 +7,8 @@ import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import edu.harvard.iq.dataverse.authorization.users.User; -import static edu.harvard.iq.dataverse.util.JsfHelper.JH; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.util.BundleUtil; import java.io.IOException; import java.io.OutputStream; import java.text.SimpleDateFormat; @@ -23,8 +24,6 @@ import javax.ejb.TransactionAttribute; import javax.ejb.TransactionAttributeType; import javax.faces.application.FacesMessage; -import javax.faces.component.EditableValueHolder; -import javax.faces.component.UIComponent; import javax.faces.component.UIInput; import javax.faces.context.FacesContext; import javax.faces.model.SelectItem; @@ -282,7 +281,7 @@ public List findArrayByGuestbookIdAndDataverseId (Long guestbookId, Lo if (guestbookResponseId != null) { singleResult[0] = result[1]; if (result[2] != null) { - singleResult[1] = new SimpleDateFormat("MMMM d, yyyy").format((Date) result[2]); + singleResult[1] = new SimpleDateFormat("yyyy-MM-dd").format((Date) result[2]); } else { singleResult[1] = "N/A"; } @@ -688,6 +687,19 @@ public GuestbookResponse initGuestbookResponse(FileMetadata fileMetadata, String guestbookResponse.setDownloadtype("Subset"); } if(downloadFormat.toLowerCase().equals("explore")){ + /** + * TODO: Investigate this "if downloadFormat=explore" and think + * about deleting it. When is downloadFormat "explore"? When is this + * method called? Previously we were passing "explore" to + * modifyDatafileAndFormat for TwoRavens but now we pass + * "externalTool" for all external tools, including TwoRavens. When + * clicking "Explore" and then the name of the tool, we want the + * name of the exploration tool (i.e. "TwoRavens", "Data Explorer", + * etc.) to be persisted as the downloadType. We execute + * guestbookResponse.setDownloadtype(externalTool.getDisplayName()) + * over in the "explore" method of FileDownloadServiceBean just + * before the guestbookResponse is written. 
+ */ guestbookResponse.setDownloadtype("Explore"); } guestbookResponse.setDataset(dataset); @@ -710,6 +722,32 @@ private void initCustomQuestions(GuestbookResponse guestbookResponse, Dataset da } } + private void setUserDefaultResponses(GuestbookResponse guestbookResponse, DataverseSession session, User userIn) { + User user; + User sessionUser = session.getUser(); + + if (userIn != null){ + user = userIn; + } else{ + user = sessionUser; + } + + if (user != null) { + guestbookResponse.setEmail(getUserEMail(user)); + guestbookResponse.setName(getUserName(user)); + guestbookResponse.setInstitution(getUserInstitution(user)); + guestbookResponse.setPosition(getUserPosition(user)); + guestbookResponse.setAuthenticatedUser(getAuthenticatedUser(user)); + } else { + guestbookResponse.setEmail(""); + guestbookResponse.setName(""); + guestbookResponse.setInstitution(""); + guestbookResponse.setPosition(""); + guestbookResponse.setAuthenticatedUser(null); + } + guestbookResponse.setSessionId(session.toString()); + } + private void setUserDefaultResponses(GuestbookResponse guestbookResponse, DataverseSession session) { User user = session.getUser(); if (user != null) { @@ -745,14 +783,38 @@ public GuestbookResponse initDefaultGuestbookResponse(Dataset dataset, DataFile return guestbookResponse; } - public void guestbookResponseValidator(FacesContext context, UIComponent toValidate, Object value) { - String response = (String) value; + public GuestbookResponse initAPIGuestbookResponse(Dataset dataset, DataFile dataFile, DataverseSession session, User user) { + GuestbookResponse guestbookResponse = new GuestbookResponse(); + Guestbook datasetGuestbook = dataset.getGuestbook(); + + if(datasetGuestbook == null){ + guestbookResponse.setGuestbook(findDefaultGuestbook()); + } else { + guestbookResponse.setGuestbook(datasetGuestbook); + } - if (response != null && response.length() > 255) { - ((UIInput) toValidate).setValid(false); - FacesMessage message = new FacesMessage(FacesMessage.SEVERITY_ERROR, JH.localize("dataset.guestbookResponse.guestbook.responseTooLong"), null); - context.addMessage(toValidate.getClientId(context), message); + if(dataset.getLatestVersion() != null && dataset.getLatestVersion().isDraft()){ + guestbookResponse.setWriteResponse(false); } + if (dataFile != null){ + guestbookResponse.setDataFile(dataFile); + } + guestbookResponse.setDataset(dataset); + guestbookResponse.setResponseTime(new Date()); + guestbookResponse.setSessionId(session.toString()); + guestbookResponse.setDownloadtype("Download"); + setUserDefaultResponses(guestbookResponse, session, user); + return guestbookResponse; + } + + public boolean guestbookResponseValidator( UIInput toValidate, String value) { + if (value != null && value.length() > 255) { + (toValidate).setValid(false); + FacesContext.getCurrentInstance().addMessage((toValidate).getClientId(), + new FacesMessage( FacesMessage.SEVERITY_ERROR, BundleUtil.getStringFromBundle("dataset.guestbookResponse.guestbook.responseTooLong"), null)); + return false; + } + return true; } public GuestbookResponse modifyDatafile(GuestbookResponse in, FileMetadata fm) { @@ -784,7 +846,21 @@ public GuestbookResponse modifyDatafileAndFormat(GuestbookResponse in, FileMetad return in; } + /** + * This method was added because on the dataset page when a popup is + * required, ExternalTool is null in the poup itself. We store ExternalTool + * in the GuestbookResponse as a transient variable so we have access to it + * later in the popup. 
+ */ + public GuestbookResponse modifyDatafileAndFormat(GuestbookResponse in, FileMetadata fm, String format, ExternalTool externalTool) { + if (in != null && externalTool != null) { + in.setExternalTool(externalTool); + } + return modifyDatafileAndFormat(in, fm, format); + } + public Boolean validateGuestbookResponse(GuestbookResponse guestbookResponse, String type) { + boolean valid = true; Dataset dataset = guestbookResponse.getDataset(); if (dataset.getGuestbook() != null) { @@ -829,7 +905,7 @@ public Boolean validateGuestbookResponse(GuestbookResponse guestbookResponse, St } } } - + return valid; } diff --git a/src/main/java/edu/harvard/iq/dataverse/GuestbookServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/GuestbookServiceBean.java index 11ec129b44e..5394ddc652a 100644 --- a/src/main/java/edu/harvard/iq/dataverse/GuestbookServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/GuestbookServiceBean.java @@ -39,8 +39,18 @@ public Long findCountUsages(Long guestbookId, Long dataverseId) { } } + public Long findCountResponsesForGivenDataset(Long guestbookId, Long datasetId) { + String queryString = ""; + if (guestbookId != null && datasetId != null) { + queryString = "select count(*) from guestbookresponse where guestbook_id = " + guestbookId + " and dataset_id = " + datasetId + ";"; + Query query = em.createNativeQuery(queryString); + return (Long) query.getSingleResult(); + } else { + return new Long(0); + } + } - + public Guestbook find(Object pk) { return em.find(Guestbook.class, pk); } diff --git a/src/main/java/edu/harvard/iq/dataverse/HarvestingClientsPage.java b/src/main/java/edu/harvard/iq/dataverse/HarvestingClientsPage.java index 545c42b4495..ca7e7d7d472 100644 --- a/src/main/java/edu/harvard/iq/dataverse/HarvestingClientsPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/HarvestingClientsPage.java @@ -255,7 +255,7 @@ public void editClient(HarvestingClient harvestingClient) { this.harvestingScheduleRadio = harvestingScheduleRadioDaily; setHourOfDayAMPMfromInteger(harvestingClient.getScheduleHourOfDay()); - } else if (HarvestingClient.SCHEDULE_PERIOD_DAILY.equals(harvestingClient.getSchedulePeriod())) { + } else if (HarvestingClient.SCHEDULE_PERIOD_WEEKLY.equals(harvestingClient.getSchedulePeriod())) { this.harvestingScheduleRadio = harvestingScheduleRadioWeekly; setHourOfDayAMPMfromInteger(harvestingClient.getScheduleHourOfDay()); setWeekdayFromInteger(harvestingClient.getScheduleDayOfWeek()); @@ -933,7 +933,7 @@ private List getWeekDays() { private Integer getWeekDayNumber (String weekDayName) { List weekDays = getWeekDays(); - int i = 1; + int i = 0; for (String weekDayString: weekDays) { if (weekDayString.equals(weekDayName)) { return new Integer(i); @@ -948,8 +948,8 @@ private Integer getWeekDayNumber() { } private void setWeekdayFromInteger(Integer weekday) { - if (weekday == null || weekday.intValue() < 1 || weekday.intValue() > 7) { - weekday = 1; + if (weekday == null || weekday < 1 || weekday > 7) { + weekday = 0; //set default to Sunday } this.newHarvestingScheduleDayOfWeek = getWeekDays().get(weekday); } diff --git a/src/main/java/edu/harvard/iq/dataverse/LoginPage.java b/src/main/java/edu/harvard/iq/dataverse/LoginPage.java index 99a1af7571a..bdff28a4272 100644 --- a/src/main/java/edu/harvard/iq/dataverse/LoginPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/LoginPage.java @@ -21,12 +21,15 @@ import java.util.Iterator; import java.util.LinkedList; import java.util.List; +import java.util.Random; import java.util.logging.Level; import 
java.util.logging.Logger; import javax.ejb.EJB; import javax.faces.application.FacesMessage; +import javax.faces.component.UIComponent; import javax.faces.context.FacesContext; import javax.faces.event.AjaxBehaviorEvent; +import javax.faces.validator.ValidatorException; import javax.faces.view.ViewScoped; import javax.inject.Inject; import javax.inject.Named; @@ -101,6 +104,11 @@ public enum EditMode {LOGIN, SUCCESS, FAILED}; private String redirectPage = "dataverse.xhtml"; private AuthenticationProvider authProvider; + private int numFailedLoginAttempts; + Random random; + long op1; + long op2; + Long userSum; public void init() { Iterator credentialsIterator = authSvc.getAuthenticationProviderIdsOfType( CredentialsAuthenticationProvider.class ).iterator(); @@ -109,6 +117,7 @@ public void init() { } resetFilledCredentials(null); authProvider = authSvc.getAuthenticationProvider(systemConfig.getDefaultAuthProvider()); + random = new Random(); } public List listCredentialsAuthenticationProviders() { @@ -159,19 +168,19 @@ public String login() { } authReq.setIpAddress( dvRequestService.getDataverseRequest().getSourceAddress() ); try { - AuthenticatedUser r = authSvc.authenticate(credentialsAuthProviderId, authReq); + AuthenticatedUser r = authSvc.getCreateAuthenticatedUser(credentialsAuthProviderId, authReq); logger.log(Level.FINE, "User authenticated: {0}", r.getEmail()); session.setUser(r); if ("dataverse.xhtml".equals(redirectPage)) { - redirectPage = redirectPage + "&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = redirectToRoot(); } try { redirectPage = URLDecoder.decode(redirectPage, "UTF-8"); } catch (UnsupportedEncodingException ex) { Logger.getLogger(LoginPage.class.getName()).log(Level.SEVERE, null, ex); - redirectPage = "dataverse.xhtml&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = redirectToRoot(); } logger.log(Level.FINE, "Sending user to = {0}", redirectPage); @@ -179,6 +188,9 @@ public String login() { } catch (AuthenticationFailedException ex) { + numFailedLoginAttempts++; + op1 = new Long(random.nextInt(10)); + op2 = new Long(random.nextInt(10)); AuthenticationResponse response = ex.getResponse(); switch ( response.getStatus() ) { case FAIL: @@ -202,6 +214,10 @@ public String login() { } } + + private String redirectToRoot(){ + return "dataverse.xhtml?alias=" + dataverseService.findRootDataverse().getAlias(); + } public String getCredentialsAuthProviderId() { return credentialsAuthProviderId; @@ -251,9 +267,52 @@ public void setAuthProviderById(String authProviderId) { public String getLoginButtonText() { if (authProvider != null) { + // Note that for ORCID we do not want the normal "Log In with..." text. There is special logic in the xhtml. return BundleUtil.getStringFromBundle("login.button", Arrays.asList(authProvider.getInfo().getTitle())); } else { return BundleUtil.getStringFromBundle("login.button", Arrays.asList("???")); } } + + public int getNumFailedLoginAttempts() { + return numFailedLoginAttempts; + } + + public boolean isRequireExtraValidation() { + if (numFailedLoginAttempts > 2) { + return true; + } else { + return false; + } + } + + public long getOp1() { + return op1; + } + + public long getOp2() { + return op2; + } + + public Long getUserSum() { + return userSum; + } + + public void setUserSum(Long userSum) { + this.userSum = userSum; + } + + // TODO: Consolidate with SendFeedbackDialog.validateUserSum? 
+ public void validateUserSum(FacesContext context, UIComponent component, Object value) throws ValidatorException { + // The FacesMessage text is on the xhtml side. + FacesMessage msg = new FacesMessage(""); + ValidatorException validatorException = new ValidatorException(msg); + if (value == null) { + throw validatorException; + } + if (op1 + op2 != (Long) value) { + throw validatorException; + } + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/MailServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/MailServiceBean.java index 09093899cc4..988a1b282de 100644 --- a/src/main/java/edu/harvard/iq/dataverse/MailServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/MailServiceBean.java @@ -126,7 +126,7 @@ public boolean sendSystemEmail(String to, String subject, String messageText) { InternetAddress[] recipients = new InternetAddress[recipientStrings.length]; for (int i = 0; i < recipients.length; i++) { try { - recipients[i] = new InternetAddress('"' + recipientStrings[i] + '"', "", charset); + recipients[i] = new InternetAddress(recipientStrings[i], "", charset); } catch (UnsupportedEncodingException ex) { logger.severe(ex.getMessage()); } diff --git a/src/main/java/edu/harvard/iq/dataverse/PermissionServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/PermissionServiceBean.java index 2b6666771a1..b15c8e2b28e 100644 --- a/src/main/java/edu/harvard/iq/dataverse/PermissionServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/PermissionServiceBean.java @@ -26,6 +26,11 @@ import javax.persistence.PersistenceContext; import static edu.harvard.iq.dataverse.engine.command.CommandHelper.CH; import edu.harvard.iq.dataverse.engine.command.DataverseRequest; +import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; +import edu.harvard.iq.dataverse.engine.command.impl.CreateDatasetCommand; +import edu.harvard.iq.dataverse.engine.command.impl.PublishDatasetCommand; +import edu.harvard.iq.dataverse.engine.command.impl.UpdateDatasetCommand; +import edu.harvard.iq.dataverse.util.BundleUtil; import java.util.Arrays; import java.util.HashMap; import java.util.LinkedList; @@ -206,8 +211,7 @@ public Set permissionsFor( DataverseRequest req, DvObject dvo ) { // Add permissions specifically given to the user permissions.addAll( permissionsForSingleRoleAssignee(req.getUser(),dvo) ); - - /* + Set groups = groupService.groupsFor(req,dvo); // Add permissions gained from groups @@ -215,8 +219,7 @@ public Set permissionsFor( DataverseRequest req, DvObject dvo ) { final Set groupPremissions = permissionsForSingleRoleAssignee(g,dvo); permissions.addAll(groupPremissions); } - */ - + if ( ! req.getUser().isAuthenticated() ) { permissions.removeAll( PERMISSIONS_FOR_AUTHENTICATED_USERS_ONLY ); } @@ -539,6 +542,48 @@ public List getDvObjectIdsUserHasRoleOn(User user, List rol return dataversesUserHasPermissionOn; } + public void checkEditDatasetLock(Dataset dataset, DataverseRequest dataverseRequest, Command command) throws IllegalCommandException { + if (dataset.isLocked()) { + if (dataset.isLockedFor(DatasetLock.Reason.InReview)) { + // The "InReview" lock is not really a lock for curators. They can still make edits. 
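+ // Being allowed to run PublishDatasetCommand is used here as a proxy for "is a curator".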
+ if (!isUserAllowedOn(dataverseRequest.getUser(), new PublishDatasetCommand(dataset, dataverseRequest, true), dataset)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.editNotAllowedInReview"), command); + } + } + if (dataset.isLockedFor(DatasetLock.Reason.Ingest)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.editNotAllowed"), command); + } + // TODO: Do we need to check for "Workflow"? Should the message be more specific? + if (dataset.isLockedFor(DatasetLock.Reason.Workflow)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.editNotAllowed"), command); + } + // TODO: Do we need to check for "DcmUpload"? Should the message be more specific? + if (dataset.isLockedFor(DatasetLock.Reason.DcmUpload)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.editNotAllowed"), command); + } + } + } - + public void checkDownloadFileLock(Dataset dataset, DataverseRequest dataverseRequest, Command command) throws IllegalCommandException { + if (dataset.isLocked()) { + if (dataset.isLockedFor(DatasetLock.Reason.InReview)) { + // The "InReview" lock is not really a lock for curators or contributors. They can still download. + if (!isUserAllowedOn(dataverseRequest.getUser(), new UpdateDatasetCommand(dataset, dataverseRequest), dataset)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.downloadNotAllowedInReview"), command); + } + } + if (dataset.isLockedFor(DatasetLock.Reason.Ingest)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.downloadNotAllowed"), command); + } + // TODO: Do we need to check for "Workflow"? Should the message be more specific? + if (dataset.isLockedFor(DatasetLock.Reason.Workflow)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.downloadNotAllowed"), command); + } + // TODO: Do we need to check for "DcmUpload"? Should the message be more specific? + if (dataset.isLockedFor(DatasetLock.Reason.DcmUpload)) { + throw new IllegalCommandException(BundleUtil.getStringFromBundle("dataset.message.locked.downloadNotAllowed"), command); + } + } + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/SendFeedbackDialog.java b/src/main/java/edu/harvard/iq/dataverse/SendFeedbackDialog.java index d68a610bd1a..67d6e673438 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SendFeedbackDialog.java +++ b/src/main/java/edu/harvard/iq/dataverse/SendFeedbackDialog.java @@ -31,6 +31,7 @@ public class SendFeedbackDialog implements java.io.Serializable { private String userMessage = ""; private String messageSubject = ""; private String messageTo = ""; + // FIXME: Remove "support@thedata.org". There's no reason to email the Dataverse *project*. People should email the *installation* instead. private String defaultRecipientEmail = "support@thedata.org"; Long op1, op2, userSum; // Either the dataverse or the dataset that the message is pertaining to @@ -161,6 +162,7 @@ public void validateUserSum(FacesContext context, UIComponent component, Object if (op1 + op2 !=(Long)value) { + // TODO: Remove this English "Sum is incorrect" string. contactFormFragment.xhtml uses contact.sum.invalid instead. 
FacesMessage msg = new FacesMessage("Sum is incorrect, please try again."); msg.setSeverity(FacesMessage.SEVERITY_ERROR); diff --git a/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java b/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java index 75ae1e00b96..0dfa2d67885 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java +++ b/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java @@ -155,5 +155,9 @@ public String getSupportTeamName() { return BrandingUtil.getSupportTeamName(systemAddress, dataverseService.findRootDataverse().getName()); } + public boolean isRootDataverseThemeDisabled() { + return isTrueForKey(Key.DisableRootDataverseTheme, false); + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/Shib.java b/src/main/java/edu/harvard/iq/dataverse/Shib.java index 9945d10c916..b5a8b6d6a51 100644 --- a/src/main/java/edu/harvard/iq/dataverse/Shib.java +++ b/src/main/java/edu/harvard/iq/dataverse/Shib.java @@ -13,6 +13,7 @@ import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.JsfHelper; +import edu.harvard.iq.dataverse.util.SystemConfig; import java.io.IOException; import java.sql.Timestamp; import java.util.ArrayList; @@ -441,9 +442,9 @@ public String getPrettyFacesHomePageString(boolean includeFacetDashRedirect) { String rootDvAlias = getRootDataverseAlias(); if (includeFacetDashRedirect) { if (rootDvAlias != null) { - return plainHomepageString + "?alias=" + rootDvAlias + "&faces-redirect=true"; + return plainHomepageString + "?alias=" + rootDvAlias + "&faces-redirect=true"; } else { - return plainHomepageString + "?faces-redirect=true"; + return plainHomepageString + "?faces-redirect=true"; } } else if (rootDvAlias != null) { /** diff --git a/src/main/java/edu/harvard/iq/dataverse/ThemeWidgetFragment.java b/src/main/java/edu/harvard/iq/dataverse/ThemeWidgetFragment.java index 8f9c79f4dc8..d9cecd71343 100644 --- a/src/main/java/edu/harvard/iq/dataverse/ThemeWidgetFragment.java +++ b/src/main/java/edu/harvard/iq/dataverse/ThemeWidgetFragment.java @@ -44,8 +44,8 @@ @ViewScoped @Named public class ThemeWidgetFragment implements java.io.Serializable { - static final String DEFAULT_LOGO_BACKGROUND_COLOR = "F5F5F5"; - static final String DEFAULT_BACKGROUND_COLOR = "F5F5F5"; + static final String DEFAULT_LOGO_BACKGROUND_COLOR = "FFFFFF"; + static final String DEFAULT_BACKGROUND_COLOR = "FFFFFF"; static final String DEFAULT_LINK_COLOR = "428BCA"; static final String DEFAULT_TEXT_COLOR = "888888"; private static final Logger logger = Logger.getLogger(ThemeWidgetFragment.class.getCanonicalName()); @@ -262,7 +262,7 @@ public void resetForm() { } public String cancel() { - return "dataverse?faces-redirect=true&alias="+editDv.getAlias(); // go to dataverse page + return "dataverse.xhtml?faces-redirect=true&alias="+editDv.getAlias(); // go to dataverse page } @@ -285,7 +285,7 @@ public String save() { this.cleanupTempDirectory(); } JsfHelper.addSuccessMessage(JH.localize("dataverse.theme.success")); - return "dataverse?faces-redirect=true&alias="+editDv.getAlias(); // go to dataverse page + return "dataverse.xhtml?faces-redirect=true&alias="+editDv.getAlias(); // go to dataverse page } } diff --git a/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogRecord.java b/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogRecord.java index fd11dbdd0af..6b3ca20a016 100644 --- 
a/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogRecord.java +++ b/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogRecord.java @@ -41,6 +41,8 @@ public enum ActionType { Auth, Admin, + + ExternalTool, GlobalGroups } @@ -70,6 +72,11 @@ public enum ActionType { public ActionLogRecord(){} + /** + * @param anActionType + * @param anActionSubType + */ + // TODO: Add ability to set `info` in constructor. public ActionLogRecord( ActionType anActionType, String anActionSubType ) { actionType = anActionType; actionSubType = anActionSubType; diff --git a/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogServiceBean.java index 0273dcd77a9..f76d3b52e22 100644 --- a/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/actionlogging/ActionLogServiceBean.java @@ -32,4 +32,5 @@ public void log( ActionLogRecord rec ) { } em.persist(rec); } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java b/src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java index 0f7f470277e..67c22bf2600 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java @@ -1,5 +1,6 @@ package edu.harvard.iq.dataverse.api; +import edu.harvard.iq.dataverse.DataFileServiceBean; import edu.harvard.iq.dataverse.Dataset; import edu.harvard.iq.dataverse.DatasetFieldServiceBean; import edu.harvard.iq.dataverse.DatasetFieldType; @@ -32,6 +33,7 @@ import edu.harvard.iq.dataverse.engine.command.exception.CommandException; import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; import edu.harvard.iq.dataverse.engine.command.exception.PermissionException; +import edu.harvard.iq.dataverse.externaltools.ExternalToolServiceBean; import edu.harvard.iq.dataverse.privateurl.PrivateUrlServiceBean; import edu.harvard.iq.dataverse.search.savedsearch.SavedSearchServiceBean; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; @@ -211,6 +213,12 @@ String getWrappedMessageWhenJson() { @EJB protected PasswordValidatorServiceBean passwordValidatorService; + @EJB + protected ExternalToolServiceBean externalToolService; + + @EJB + DataFileServiceBean fileSvc; + @PersistenceContext(unitName = "VDCNet-ejbPU") protected EntityManager em; diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Access.java b/src/main/java/edu/harvard/iq/dataverse/api/Access.java index 53de5a4663e..e7e76f8d985 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Access.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Access.java @@ -129,9 +129,10 @@ public class Access extends AbstractApiBean { @Path("datafile/bundle/{fileId}") @GET @Produces({"application/zip"}) - public BundleDownloadInstance datafileBundle(@PathParam("fileId") Long fileId, @QueryParam("key") String apiToken, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ { + public BundleDownloadInstance datafileBundle(@PathParam("fileId") Long fileId, @QueryParam("gbrecs") Boolean gbrecs, @QueryParam("key") String apiToken, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, 
AuthorizationRequiredException*/ { DataFile df = dataFileService.find(fileId); + GuestbookResponse gbr = null; if (df == null) { logger.warning("Access: datafile service could not locate a DataFile object for id "+fileId+"!"); @@ -146,6 +147,13 @@ public BundleDownloadInstance datafileBundle(@PathParam("fileId") Long fileId, @ // exit code, if access isn't authorized: checkAuthorization(df, apiToken); + if (gbrecs == null && df.isReleased()){ + // Write Guestbook record if not done previously and file is released + User apiTokenUser = findAPITokenUser(apiToken); + gbr = guestbookResponseService.initAPIGuestbookResponse(df.getOwner(), df, session, apiTokenUser); + guestbookResponseService.save(gbr); + } + DownloadInfo dInfo = new DownloadInfo(df); BundleDownloadInstance downloadInstance = new BundleDownloadInstance(dInfo); @@ -179,25 +187,32 @@ public BundleDownloadInstance datafileBundle(@PathParam("fileId") Long fileId, @ @Path("datafile/{fileId}") @GET @Produces({ "application/xml" }) - public DownloadInstance datafile(@PathParam("fileId") Long fileId, @QueryParam("gbrecs") Boolean gbrecs, @QueryParam("key") String apiToken, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ { + public DownloadInstance datafile(@PathParam("fileId") Long fileId, @QueryParam("gbrecs") Boolean gbrecs, @QueryParam("key") String apiToken, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) { DataFile df = dataFileService.find(fileId); GuestbookResponse gbr = null; - /* - if (gbrecs == null && df.isReleased()){ - //commenting out for 4.6 SEK - // gbr = guestbookResponseService.initDefaultGuestbookResponse(df.getOwner(), df, session); - } - */ + if (df == null) { logger.warning("Access: datafile service could not locate a DataFile object for id "+fileId+"!"); throw new WebApplicationException(Response.Status.NOT_FOUND); } + if (df.isHarvested()) { + throw new WebApplicationException(Response.Status.NOT_FOUND); + // (nobody should ever be using this API on a harvested DataFile)! 
+ } + if (apiToken == null || apiToken.equals("")) { apiToken = headers.getHeaderString(API_KEY_HEADER); } + + if (gbrecs == null && df.isReleased()){ + // Write Guestbook record if not done previously and file is released + User apiTokenUser = findAPITokenUser(apiToken); + gbr = guestbookResponseService.initAPIGuestbookResponse(df.getOwner(), df, session, apiTokenUser); + } + // This will throw a WebApplicationException, with the correct // exit code, if access isn't authorized: checkAuthorization(df, apiToken); @@ -435,13 +450,8 @@ public DownloadInstance tabularDatafileMetadataPreprocessed(@PathParam("fileId") @Path("datafiles/{fileIds}") @GET @Produces({"application/zip"}) - public /*ZippedDownloadInstance*/ Response datafiles(@PathParam("fileIds") String fileIds, @QueryParam("key") String apiTokenParam, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) throws WebApplicationException /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ { - // create a Download Instance without, without a primary Download Info object: - //ZippedDownloadInstance downloadInstance = new ZippedDownloadInstance(); + public Response datafiles(@PathParam("fileIds") String fileIds, @QueryParam("gbrecs") Boolean gbrecs, @QueryParam("key") String apiTokenParam, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) throws WebApplicationException /*throws NotFoundException, ServiceUnavailableException, PermissionDeniedException, AuthorizationRequiredException*/ { - - - long setLimit = systemConfig.getZipDownloadLimit(); if (!(setLimit > 0L)) { setLimit = DataFileZipper.DEFAULT_ZIPFILE_LIMIT; @@ -459,6 +469,8 @@ public DownloadInstance tabularDatafileMetadataPreprocessed(@PathParam("fileId") ? headers.getHeaderString(API_KEY_HEADER) : apiTokenParam; + User apiTokenUser = findAPITokenUser(apiToken); //for use in adding gb records if necessary + StreamingOutput stream = new StreamingOutput() { @Override @@ -493,7 +505,10 @@ public void write(OutputStream os) throws IOException, } logger.fine("adding datafile (id=" + file.getId() + ") to the download list of the ZippedDownloadInstance."); //downloadInstance.addDataFile(file); - + if (gbrecs == null && file.isReleased()){ + GuestbookResponse gbr = guestbookResponseService.initAPIGuestbookResponse(file.getOwner(), file, session, apiTokenUser); + guestbookResponseService.save(gbr); + } if (zipper == null) { // This is the first file we can serve - so we now know that we are going to be able // to produce some output. @@ -548,20 +563,6 @@ public void write(OutputStream os) throws IOException, return Response.ok(stream).build(); } - - /* - * Geting rid of the tempPreview API - it's always been a big, fat hack. - * the edit files page is now using the Base64 image strings in the preview - * URLs, just like the search and dataset pages. 
- @Path("tempPreview/{fileSystemId}") - @GET - @Produces({"image/png"}) - public InputStream tempPreview(@PathParam("fileSystemId") String fileSystemId, @Context UriInfo uriInfo, @Context HttpHeaders headers, @Context HttpServletResponse response) { - - }*/ - - - @Path("fileCardImage/{fileId}") @GET @Produces({ "image/png" }) @@ -1135,5 +1136,32 @@ private boolean isAccessAuthorized(DataFile df, String apiToken) { return false; } + + + + private User findAPITokenUser(String apiToken) { + User apiTokenUser = null; + + if ((apiToken != null) && (apiToken.length() != 64)) { + // We'll also try to obtain the user information from the API token, + // if supplied: + + try { + logger.fine("calling apiTokenUser = findUserOrDie()..."); + apiTokenUser = findUserOrDie(); + return apiTokenUser; + } catch (WrappedResponse wr) { + logger.log(Level.FINE, "Message from findUserOrDie(): {0}", wr.getMessage()); + return null; + } + + } + return apiTokenUser; + } + + + + + } \ No newline at end of file diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Admin.java b/src/main/java/edu/harvard/iq/dataverse/api/Admin.java index 92086575199..663bfd4578f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Admin.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Admin.java @@ -63,8 +63,6 @@ import edu.harvard.iq.dataverse.ingest.IngestServiceBean; import edu.harvard.iq.dataverse.userdata.UserListMaker; import edu.harvard.iq.dataverse.userdata.UserListResult; -import edu.harvard.iq.dataverse.util.StringUtil; -import java.math.BigDecimal; import java.util.Date; import java.util.ResourceBundle; import javax.inject.Inject; @@ -246,7 +244,7 @@ public Response checkAuthenticationProviderEnabled(@PathParam("id")String id){ return ok(Boolean.toString(prvs.get(0).isEnabled())); } } - + @DELETE @Path("authenticationProviders/{id}/") public Response deleteAuthenticationProvider( @PathParam("id") String id ) { @@ -1002,4 +1000,10 @@ public Response validatePassword(String password) { .add("errors", errorArray) ); } + + @GET + @Path("/isOrcid") + public Response isOrcidEnabled() { + return authSvc.isOrcidEnabled() ? 
ok("Orcid is enabled") : ok("no orcid for you."); + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java b/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java index 233d57e1b45..633623719a4 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java @@ -10,6 +10,7 @@ import edu.harvard.iq.dataverse.authorization.providers.builtin.PasswordEncryption; import edu.harvard.iq.dataverse.authorization.users.ApiToken; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import java.sql.Timestamp; import java.util.Calendar; import java.util.logging.Level; @@ -53,6 +54,14 @@ public class BuiltinUsers extends AbstractApiBean { @GET @Path("{username}/api-token") public Response getApiToken( @PathParam("username") String username, @QueryParam("password") String password ) { + boolean disabled = true; + boolean lookupAllowed = settingsSvc.isTrueForKey(SettingsServiceBean.Key.AllowApiTokenLookupViaApi, false); + if (lookupAllowed) { + disabled = false; + } + if (disabled) { + return error(Status.FORBIDDEN, "This API endpoint has been disabled."); + } BuiltinUser u = null; if (retrievingApiTokenViaEmailEnabled) { u = builtinUserSvc.findByUsernameOrEmail(username); diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java index 7b072aadd3a..e38a9fd3ca5 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java @@ -54,6 +54,7 @@ import edu.harvard.iq.dataverse.engine.command.impl.ImportFromFileSystemCommand; import edu.harvard.iq.dataverse.engine.command.impl.ListRoleAssignments; import edu.harvard.iq.dataverse.engine.command.impl.ListVersionsCommand; +import edu.harvard.iq.dataverse.engine.command.impl.MoveDatasetCommand; import edu.harvard.iq.dataverse.engine.command.impl.PublishDatasetCommand; import edu.harvard.iq.dataverse.engine.command.impl.PublishDatasetResult; import edu.harvard.iq.dataverse.engine.command.impl.RequestRsyncScriptCommand; @@ -405,26 +406,6 @@ public Response publishDataseUsingGetDeprecated( @PathParam("id") String id, @Qu return publishDataset(id, type); } - // TODO SBG: Delete me - @EJB - WorkflowServiceBean workflows; - - @PUT - @Path("{id}/actions/wf/{wfid}") - public Response DELETEME(@PathParam("id") String id, @PathParam("wfid") String wfid) { - try { - Workflow wf = workflows.getWorkflow(Long.parseLong(wfid)).get(); - Dataset ds = findDatasetOrDie(id); - WorkflowContext ctxt = new WorkflowContext(createDataverseRequest(findUserOrDie()), ds, 0, 0, WorkflowContext.TriggerType.PostPublishDataset, "DataCite"); - workflows.start(wf, ctxt); - return ok("Started workflow " + wf.getName() + " on dataset " + ds.getId() ); - - } catch (WrappedResponse ex) { - return ex.getResponse(); - } - } - // TODO SBG: /Delete me - @POST @Path("{id}/actions/:publish") public Response publishDataset(@PathParam("id") String id, @QueryParam("type") String type) { @@ -456,7 +437,27 @@ public Response publishDataset(@PathParam("id") String id, @QueryParam("type") S return ex.getResponse(); } } - + + @POST + @Path("{id}/move/{targetDataverseAlias}") + public Response moveDataset(@PathParam("id") String id, @PathParam("targetDataverseAlias") String targetDataverseAlias, @QueryParam("forceMove") Boolean force) { + try{ + System.out.print("force: " + force); + User u = 
findUserOrDie(); + Dataset ds = findDatasetOrDie(id); + Dataverse target = dataverseService.findByAlias(targetDataverseAlias); + if (target == null){ + return error(Response.Status.BAD_REQUEST, "Target Dataverse not found."); + } + //Command requires Super user - it will be tested by the command + execCommand(new MoveDatasetCommand( + createDataverseRequest(u), ds, target, force + )); + return ok("Dataset moved successfully"); + } catch (WrappedResponse ex) { + return ex.getResponse(); + } + } @GET @Path("{id}/links") public Response getLinks(@PathParam("id") String idSupplied ) { @@ -655,7 +656,7 @@ public Response getRsync(@PathParam("identifier") String id) { @POST @Path("{identifier}/dataCaptureModule/checksumValidation") public Response receiveChecksumValidationResults(@PathParam("identifier") String id, JsonObject jsonFromDcm) { - logger.fine("jsonFromDcm: " + jsonFromDcm); + logger.log(Level.FINE, "jsonFromDcm: {0}", jsonFromDcm); AuthenticatedUser authenticatedUser = null; try { authenticatedUser = findAuthenticatedUserOrDie(); @@ -712,13 +713,7 @@ public Response submitForReview(@PathParam("id") String idSupplied) { Dataset updatedDataset = execCommand(new SubmitDatasetForReviewCommand(createDataverseRequest(findUserOrDie()), findDatasetOrDie(idSupplied))); JsonObjectBuilder result = Json.createObjectBuilder(); - boolean inReview = false; - try{ - inReview = updatedDataset.getDatasetLock().getReason().equals(DatasetLock.Reason.InReview); - } catch (Exception e){ - System.out.print("submit exception: " + e.getMessage()); - // if there's no lock then it can't be in review by definition - } + boolean inReview = updatedDataset.isLockedFor(DatasetLock.Reason.InReview); result.add("inReview", inReview); result.add("message", "Dataset id " + updatedDataset.getId() + " has been submitted for review."); @@ -747,12 +742,7 @@ public Response returnToAuthor(@PathParam("id") String idSupplied, String jsonBo } AuthenticatedUser authenticatedUser = findAuthenticatedUserOrDie(); Dataset updatedDataset = execCommand(new ReturnDatasetToAuthorCommand(createDataverseRequest(authenticatedUser), dataset, reasonForReturn )); - boolean inReview = false; - try{ - inReview = updatedDataset.getDatasetLock().getReason().equals(DatasetLock.Reason.InReview); - } catch (Exception e){ - // if there's no lock then it can't be in review by definition - } + boolean inReview = updatedDataset.isLockedFor(DatasetLock.Reason.InReview); JsonObjectBuilder result = Json.createObjectBuilder(); result.add("inReview", inReview); @@ -767,9 +757,8 @@ public Response returnToAuthor(@PathParam("id") String idSupplied, String jsonBo * Add a File to an existing Dataset * * @param idSupplied - * @param datasetId * @param jsonData - * @param testFileInputStream + * @param fileInputStream * @param contentDispositionHeader * @param formDataBodyPart * @return diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java index bb13ced99c6..b75f8ae4f19 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java @@ -97,10 +97,6 @@ public class Dataverses extends AbstractApiBean { @Deprecated private static final Logger LOGGER = Logger.getLogger(Dataverses.class.getName()); private static final Logger logger = Logger.getLogger(Dataverses.class.getCanonicalName()); -// static final String DEFAULT_LOGO_BACKGROUND_COLOR = "F5F5F5"; -// static final String DEFAULT_BACKGROUND_COLOR = "F5F5F5"; -// 
static final String DEFAULT_LINK_COLOR = "428BCA"; -// static final String DEFAULT_TEXT_COLOR = "888888"; @EJB ExplicitGroupServiceBean explicitGroupSvc; @@ -245,8 +241,11 @@ public Response createDataset( String jsonBody, @PathParam("identifier") String } Dataset managedDs = execCommand(new CreateDatasetCommand(ds, createDataverseRequest(u))); - return created( "/datasets/" + managedDs.getId(), - Json.createObjectBuilder().add("id", managedDs.getId()) ); + return created("/datasets/" + managedDs.getId(), + Json.createObjectBuilder() + .add("id", managedDs.getId()) + .add("persistentId", managedDs.getGlobalId()) + ); } catch ( WrappedResponse ex ) { return ex.getResponse(); diff --git a/src/main/java/edu/harvard/iq/dataverse/api/DownloadInstanceWriter.java b/src/main/java/edu/harvard/iq/dataverse/api/DownloadInstanceWriter.java index df4cffae28c..4081d710389 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/DownloadInstanceWriter.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/DownloadInstanceWriter.java @@ -29,9 +29,12 @@ import edu.harvard.iq.dataverse.engine.command.impl.CreateGuestbookResponseCommand; import java.io.File; import java.io.FileInputStream; +import java.net.URI; +import java.net.URISyntaxException; import java.util.ArrayList; import java.util.List; import java.util.logging.Logger; +import javax.ws.rs.RedirectionException; /** * @@ -206,6 +209,44 @@ public void writeTo(DownloadInstance di, Class clazz, Type type, Annotation[] if (storageIO == null) { throw new WebApplicationException(Response.Status.SERVICE_UNAVAILABLE); } + } else { + if (storageIO instanceof S3AccessIO && !(dataFile.isTabularData()) && isRedirectToS3()) { + // [attempt to] redirect: + String redirect_url_str = ((S3AccessIO)storageIO).generateTemporaryS3Url(); + // better exception handling here? + logger.info("Data Access API: direct S3 url: "+redirect_url_str); + URI redirect_uri; + + try { + redirect_uri = new URI(redirect_url_str); + } catch (URISyntaxException ex) { + logger.info("Data Access API: failed to create S3 redirect url ("+redirect_url_str+")"); + redirect_uri = null; + } + if (redirect_uri != null) { + // definitely close the (still open) S3 input stream, + // since we are not going to use it. The S3 documentation + // emphasizes that it is very important not to leave these + // lying around un-closed, since they are going to fill + // up the S3 connection pool! 
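+ // (This redirect branch is only reached when the dataverse.files.s3-download-redirect JVM option is set to "true"; see isRedirectToS3() below.)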
+ storageIO.getInputStream().close(); + + // increment the download count, if necessary: + if (di.getGbr() != null) { + try { + logger.fine("writing guestbook response, for an S3 download redirect."); + Command cmd = new CreateGuestbookResponseCommand(di.getDataverseRequestService().getDataverseRequest(), di.getGbr(), di.getGbr().getDataFile().getOwner()); + di.getCommand().submit(cmd); + } catch (CommandException e) { + } + } + + // finally, issue the redirect: + Response response = Response.seeOther(redirect_uri).build(); + logger.info("Issuing redirect to the file location on S3."); + throw new RedirectionException(response); + } + } } InputStream instream = storageIO.getInputStream(); @@ -284,13 +325,10 @@ public void writeTo(DownloadInstance di, Class clazz, Type type, Annotation[] logger.fine("writing guestbook response."); Command cmd = new CreateGuestbookResponseCommand(di.getDataverseRequestService().getDataverseRequest(), di.getGbr(), di.getGbr().getDataFile().getOwner()); di.getCommand().submit(cmd); - } catch (CommandException e) { - //if an error occurs here then download won't happen no need for response recs... - } + } catch (CommandException e) {} } else { logger.fine("not writing guestbook response"); } - instream.close(); outstream.close(); @@ -376,5 +414,13 @@ private long getFileSize(DownloadInstance di, String extraHeader) { } return -1; } + + private boolean isRedirectToS3() { + String optionValue = System.getProperty("dataverse.files.s3-download-redirect"); + if ("true".equalsIgnoreCase(optionValue)) { + return true; + } + return false; + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/ExternalTools.java b/src/main/java/edu/harvard/iq/dataverse/api/ExternalTools.java new file mode 100644 index 00000000000..d809e047b31 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/api/ExternalTools.java @@ -0,0 +1,74 @@ +package edu.harvard.iq.dataverse.api; + +import edu.harvard.iq.dataverse.DataFile; +import edu.harvard.iq.dataverse.actionlogging.ActionLogRecord; +import static edu.harvard.iq.dataverse.api.AbstractApiBean.error; +import edu.harvard.iq.dataverse.externaltools.ExternalTool; +import edu.harvard.iq.dataverse.externaltools.ExternalToolServiceBean; +import java.util.List; +import javax.json.Json; +import javax.json.JsonArrayBuilder; +import javax.ws.rs.DELETE; +import javax.ws.rs.GET; +import javax.ws.rs.POST; +import javax.ws.rs.Path; +import javax.ws.rs.PathParam; +import javax.ws.rs.core.Response; +import static javax.ws.rs.core.Response.Status.BAD_REQUEST; + +@Path("admin/externalTools") +public class ExternalTools extends AbstractApiBean { + + @GET + public Response getExternalTools() { + JsonArrayBuilder jab = Json.createArrayBuilder(); + externalToolService.findAll().forEach((externalTool) -> { + jab.add(externalTool.toJson()); + }); + return ok(jab); + } + + @POST + public Response addExternalTool(String manifest) { + try { + ExternalTool externalTool = ExternalToolServiceBean.parseAddExternalToolManifest(manifest); + ExternalTool saved = externalToolService.save(externalTool); + Long toolId = saved.getId(); + actionLogSvc.log(new ActionLogRecord(ActionLogRecord.ActionType.ExternalTool, "addExternalTool").setInfo("External tool added with id " + toolId + ".")); + return ok(saved.toJson()); + } catch (Exception ex) { + return error(BAD_REQUEST, ex.getMessage()); + } + + } + + @DELETE + @Path("{id}") + public Response deleteExternalTool(@PathParam("id") long externalToolIdFromUser) { + boolean deleted = 
externalToolService.delete(externalToolIdFromUser); + if (deleted) { + return ok("Deleted external tool with id of " + externalToolIdFromUser); + } else { + return error(BAD_REQUEST, "Could not delete external tool with id of " + externalToolIdFromUser); + } + } + + @GET + @Path("file/{id}") + public Response getExternalToolsByFile(@PathParam("id") Long fileIdFromUser) { + DataFile dataFile = fileSvc.find(fileIdFromUser); + if (dataFile == null) { + return error(BAD_REQUEST, "Could not find datafile with id " + fileIdFromUser); + } + JsonArrayBuilder tools = Json.createArrayBuilder(); + + List allExternalTools = externalToolService.findAll(); + List toolsByFile = ExternalToolServiceBean.findExternalToolsByFile(allExternalTools, dataFile); + for (ExternalTool tool : toolsByFile) { + tools.add(tool.toJson()); + } + + return ok(tools); + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Info.java b/src/main/java/edu/harvard/iq/dataverse/api/Info.java index bd24ea8923c..a2f9d7a5217 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Info.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Info.java @@ -46,4 +46,10 @@ public Response getInfo() { public Response getServer() { return response( req -> ok(systemConfig.getDataverseServer())); } + + @GET + @Path("apiTermsOfUse") + public Response getTermsOfUse() { + return allowCors(response( req -> ok(systemConfig.getApiTermsOfUse()))); + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/ContainerManagerImpl.java b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/ContainerManagerImpl.java index 3409f419969..5301024afa1 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/ContainerManagerImpl.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/ContainerManagerImpl.java @@ -129,7 +129,6 @@ public DepositReceipt replaceMetadata(String uri, Deposit deposit, AuthCredentia String globalId = urlManager.getTargetIdentifier(); Dataset dataset = datasetService.findByGlobalId(globalId); if (dataset != null) { - SwordUtil.datasetLockCheck(dataset); Dataverse dvThatOwnsDataset = dataset.getOwner(); UpdateDatasetCommand updateDatasetCommand = new UpdateDatasetCommand(dataset, dvReq); if (!permissionService.isUserAllowedOn(user, updateDatasetCommand, dataset)) { @@ -222,7 +221,6 @@ public void deleteContainer(String uri, AuthCredentials authCredentials, SwordCo if (!permissionService.isUserAllowedOn(user, deleteDatasetVersionCommand, dataset)) { throw new SwordError(UriRegistry.ERROR_BAD_REQUEST, "User " + user.getDisplayInfo().getTitle() + " is not authorized to modify " + dvThatOwnsDataset.getAlias()); } - SwordUtil.datasetLockCheck(dataset); DatasetVersion.VersionState datasetVersionState = dataset.getLatestVersion().getVersionState(); if (dataset.isReleased()) { if (datasetVersionState.equals(DatasetVersion.VersionState.DRAFT)) { diff --git a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/MediaResourceManagerImpl.java b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/MediaResourceManagerImpl.java index 714883c9c33..c79e8660329 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/MediaResourceManagerImpl.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/MediaResourceManagerImpl.java @@ -159,7 +159,6 @@ public void deleteMediaResource(String uri, AuthCredentials authCredentials, Swo DataFile fileToDelete = dataFileService.find(fileIdLong); if (fileToDelete != null) { Dataset dataset = fileToDelete.getOwner(); - 
SwordUtil.datasetLockCheck(dataset); Dataset datasetThatOwnsFile = fileToDelete.getOwner(); Dataverse dataverseThatOwnsFile = datasetThatOwnsFile.getOwner(); /** @@ -216,7 +215,6 @@ DepositReceipt replaceOrAddFiles(String uri, Deposit deposit, AuthCredentials au if (!permissionService.isUserAllowedOn(user, updateDatasetCommand, dataset)) { throw new SwordError(UriRegistry.ERROR_BAD_REQUEST, "user " + user.getDisplayInfo().getTitle() + " is not authorized to modify dataset with global ID " + dataset.getGlobalId()); } - SwordUtil.datasetLockCheck(dataset); //--------------------------------------- // Make sure that the upload type is not rsync diff --git a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/StatementManagerImpl.java b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/StatementManagerImpl.java index 5089204f854..f6c9bcca18c 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/StatementManagerImpl.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/StatementManagerImpl.java @@ -14,7 +14,9 @@ import java.util.HashMap; import java.util.List; import java.util.Map; +import java.util.Optional; import java.util.logging.Logger; +import static java.util.stream.Collectors.joining; import javax.ejb.EJB; import javax.inject.Inject; import javax.servlet.http.HttpServletRequest; @@ -91,14 +93,16 @@ public Statement getStatement(String editUri, Map map, AuthCrede states.put("latestVersionState", dataset.getLatestVersion().getVersionState().toString()); Boolean isMinorUpdate = dataset.getLatestVersion().isMinorUpdate(); states.put("isMinorUpdate", isMinorUpdate.toString()); - DatasetLock lock = dataset.getDatasetLock(); - if (lock != null) { + + if ( dataset.isLocked() ) { states.put("locked", "true"); - states.put("lockedDetail", lock.getInfo()); - states.put("lockedStartTime", lock.getStartTime().toString()); + states.put("lockedDetail", dataset.getLocks().stream().map( l-> l.getInfo() ).collect( joining(",")) ); + Optional earliestLock = dataset.getLocks().stream().min((l1, l2) -> (int)Math.signum(l1.getStartTime().getTime()-l2.getStartTime().getTime()) ); + states.put("lockedStartTime", earliestLock.get().getStartTime().toString()); } else { states.put("locked", "false"); } + statement.setStates(states); List fileMetadatas = dataset.getLatestVersion().getFileMetadatas(); for (FileMetadata fileMetadata : fileMetadatas) { diff --git a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/SwordUtil.java b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/SwordUtil.java index a35acfb200e..39575fb6fdb 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/SwordUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/datadeposit/SwordUtil.java @@ -1,7 +1,5 @@ package edu.harvard.iq.dataverse.api.datadeposit; -import edu.harvard.iq.dataverse.Dataset; -import edu.harvard.iq.dataverse.DatasetLock; import org.swordapp.server.SwordError; import org.swordapp.server.UriRegistry; @@ -12,7 +10,7 @@ public class SwordUtil { static String DCTERMS = "http://purl.org/dc/terms/"; - /** + /* * @todo get rid of this method */ public static SwordError throwSpecialSwordErrorWithoutStackTrace(String SwordUriRegistryError, String error) { @@ -28,7 +26,7 @@ public static SwordError throwSpecialSwordErrorWithoutStackTrace(String SwordUri return swordError; } - /** + /* * @todo get rid of this method */ public static SwordError throwRegularSwordErrorWithoutStackTrace(String error) { @@ -41,12 +39,4 @@ public static SwordError 
throwRegularSwordErrorWithoutStackTrace(String error) { return swordError; } - public static void datasetLockCheck(Dataset dataset) throws SwordError { - DatasetLock datasetLock = dataset.getDatasetLock(); - if (datasetLock != null) { - String message = "Please try again later. Unable to perform operation due to dataset lock: " + datasetLock.getInfo(); - throw new SwordError(UriRegistry.ERROR_BAD_REQUEST, message); - } - } - } diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticatedUserDisplayInfo.java b/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticatedUserDisplayInfo.java index 739c0c915bf..909e124e5bd 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticatedUserDisplayInfo.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticatedUserDisplayInfo.java @@ -15,7 +15,7 @@ public class AuthenticatedUserDisplayInfo extends RoleAssigneeDisplayInfo { private String firstName; private String position; - /** + /* * @todo Shouldn't we persist the displayName too? It still exists on the * authenticateduser table. */ diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java index 401a5d0a932..8eadbe70221 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java @@ -47,6 +47,7 @@ import javax.ejb.EJB; import javax.ejb.EJBException; import javax.ejb.Singleton; +import javax.inject.Named; import javax.persistence.EntityManager; import javax.persistence.NoResultException; import javax.persistence.NonUniqueResultException; @@ -63,6 +64,7 @@ * * Register the providers in the {@link #startup()} method. */ +@Named @Singleton public class AuthenticationServiceBean { private static final Logger logger = Logger.getLogger(AuthenticationServiceBean.class.getName()); @@ -114,7 +116,8 @@ public void startup() { registerProviderFactory( new BuiltinAuthenticationProviderFactory(builtinUserServiceBean, passwordValidatorService) ); registerProviderFactory( new ShibAuthenticationProviderFactory() ); registerProviderFactory( new OAuth2AuthenticationProviderFactory() ); - } catch (AuthorizationSetupException ex) { + + } catch (AuthorizationSetupException ex) { logger.log(Level.SEVERE, "Exception setting up the authentication provider factories: " + ex.getMessage(), ex); } @@ -233,7 +236,12 @@ public void removeApiToken(AuthenticatedUser user){ em.remove(apiToken); } } - } + } + + public boolean isOrcidEnabled() { + return oAuth2authenticationProviders.values().stream().anyMatch( s -> s.getId().toLowerCase().contains("orcid") ); + } + /** * Use with care! This method was written primarily for developers * interested in API testing who want to: @@ -317,7 +325,19 @@ public AuthenticatedUser getAuthenticatedUserByEmail( String email ) { } } - public AuthenticatedUser authenticate( String authenticationProviderId, AuthenticationRequest req ) throws AuthenticationFailedException { + /** + * Returns an {@link AuthenticatedUser} matching the passed provider id and the authentication request. If + * no such user exists, it is created and then returned. + * + * Invariant: upon successful return from this call, an {@link AuthenticatedUser} record + * matching the request and provider exists in the database. 
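+ * + * Typical use (see {@code LoginPage.login()}): {@code AuthenticatedUser r = authSvc.getCreateAuthenticatedUser(credentialsAuthProviderId, authReq); session.setUser(r);}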
+ * + * @param authenticationProviderId + * @param req + * @return The authenticated user for the passed provider id and authentication request. + * @throws AuthenticationFailedException + */ + public AuthenticatedUser getCreateAuthenticatedUser( String authenticationProviderId, AuthenticationRequest req ) throws AuthenticationFailedException { AuthenticationProvider prv = getAuthenticationProvider(authenticationProviderId); if ( prv == null ) throw new IllegalArgumentException("No authentication provider listed under id " + authenticationProviderId ); if ( ! (prv instanceof CredentialsAuthenticationProvider) ) { @@ -333,18 +353,16 @@ public AuthenticatedUser authenticate( String authenticationProviderId, Authenti user = userService.updateLastLogin(user); } - /** - * @todo Why does a method called "authenticate" have the potential - * to call "createAuthenticatedUser"? Isn't the creation of a user a - * different action than authenticating? - * - * @todo Wouldn't this be more readable with if/else rather than - * ternary? (please) - */ - return ( user == null ) ? - AuthenticationServiceBean.this.createAuthenticatedUser( - new UserRecordIdentifier(authenticationProviderId, resp.getUserId()), resp.getUserId(), resp.getUserDisplayInfo(), true ) - : (BuiltinAuthenticationProvider.PROVIDER_ID.equals(user.getAuthenticatedUserLookup().getAuthenticationProviderId())) ? user : updateAuthenticatedUser(user, resp.getUserDisplayInfo()); + if ( user == null ) { + return createAuthenticatedUser( + new UserRecordIdentifier(authenticationProviderId, resp.getUserId()), resp.getUserId(), resp.getUserDisplayInfo(), true ); + } else { + if (BuiltinAuthenticationProvider.PROVIDER_ID.equals(user.getAuthenticatedUserLookup().getAuthenticationProviderId())) { + return user; + } else { + return updateAuthenticatedUser(user, resp.getUserDisplayInfo()); + } + } } else { throw new AuthenticationFailedException(resp, "Authentication Failed: " + resp.getMessage()); } @@ -778,7 +796,7 @@ public AuthenticatedUser canLogInAsBuiltinUser(String username, String password) String credentialsAuthProviderId = BuiltinAuthenticationProvider.PROVIDER_ID; try { - AuthenticatedUser au = authenticate(credentialsAuthProviderId, authReq); + AuthenticatedUser au = getCreateAuthenticatedUser(credentialsAuthProviderId, authReq); logger.fine("User authenticated:" + au.getEmail()); return au; } catch (AuthenticationFailedException ex) { diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/builtin/DataverseUserPage.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/builtin/DataverseUserPage.java index 37c3a75cddd..df2db501035 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/builtin/DataverseUserPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/builtin/DataverseUserPage.java @@ -340,14 +340,14 @@ public String save() { // go back to where user came from if ("dataverse.xhtml".equals(redirectPage)) { - redirectPage = redirectPage + "&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = redirectPage + "?alias=" + dataverseService.findRootDataverse().getAlias(); } try { redirectPage = URLDecoder.decode(redirectPage, "UTF-8"); } catch (UnsupportedEncodingException ex) { logger.log(Level.SEVERE, "Server does not support 'UTF-8' encoding.", ex); - redirectPage = "dataverse.xhtml&alias=" + dataverseService.findRootDataverse().getAlias(); + redirectPage = "dataverse.xhtml?alias=" + dataverseService.findRootDataverse().getAlias(); } 
logger.log(Level.FINE, "Sending user to = {0}", redirectPage); diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/AbstractOAuth2AuthenticationProvider.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/AbstractOAuth2AuthenticationProvider.java index 5e89df5118e..8cfb84e7ce3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/AbstractOAuth2AuthenticationProvider.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/AbstractOAuth2AuthenticationProvider.java @@ -16,6 +16,7 @@ import java.util.List; import java.util.Objects; import java.util.Optional; +import java.util.logging.Level; import java.util.logging.Logger; /** @@ -90,8 +91,6 @@ public String toString() { protected String redirectUrl; protected String scope; - public AbstractOAuth2AuthenticationProvider(){} - public abstract BaseApi getApiInstance(); protected abstract ParsedUserResponse parseUserResponse( String responseBody ); @@ -111,6 +110,7 @@ public OAuth20Service getService(String state, String redirectUrl) { public OAuth2UserRecord getUserRecord(String code, String state, String redirectUrl) throws IOException, OAuth2Exception { OAuth20Service service = getService(state, redirectUrl); OAuth2AccessToken accessToken = service.getAccessToken(code); + final String userEndpoint = getUserEndpoint(accessToken); final OAuthRequest request = new OAuthRequest(Verb.GET, userEndpoint, service); @@ -120,13 +120,13 @@ public OAuth2UserRecord getUserRecord(String code, String state, String redirect final Response response = request.send(); int responseCode = response.getCode(); final String body = response.getBody(); - logger.fine("In getUserRecord. Body: " + body); + logger.log(Level.FINE, "In getUserRecord. 
Body: {0}", body); if ( responseCode == 200 ) { final ParsedUserResponse parsed = parseUserResponse(body); return new OAuth2UserRecord(getId(), parsed.userIdInProvider, parsed.username, - accessToken.getAccessToken(), + OAuth2TokenData.from(accessToken), parsed.displayInfo, parsed.emails); } else { diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2FirstLoginPage.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2FirstLoginPage.java index 9c29b4319b2..9ca92466465 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2FirstLoginPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2FirstLoginPage.java @@ -66,6 +66,9 @@ public class OAuth2FirstLoginPage implements java.io.Serializable { @EJB AuthTestDataServiceBean authTestDataSvc; + @EJB + OAuth2TokenDataServiceBean oauth2Tokens; + @Inject DataverseSession session; @@ -99,7 +102,7 @@ public void init() throws IOException { logger.fine("init called"); AbstractOAuth2AuthenticationProvider.DevOAuthAccountType devMode = systemConfig.getDevOAuthAccountType(); - logger.fine("devMode: " + devMode); + logger.log(Level.FINE, "devMode: {0}", devMode); if (!AbstractOAuth2AuthenticationProvider.DevOAuthAccountType.PRODUCTION.equals(devMode)) { if (devMode.toString().startsWith("RANDOM")) { Map randomUser = authTestDataSvc.getRandomUser(); @@ -136,7 +139,8 @@ public void init() throws IOException { } String randomUsername = randomUser.get("username"); String eppn = randomUser.get("eppn"); - String accessToken = "qwe-addssd-iiiiie"; + OAuth2TokenData accessToken = new OAuth2TokenData(); + accessToken.setAccessToken("qwe-addssd-iiiiie"); setNewUser(new OAuth2UserRecord(authProviderId, eppn, randomUsername, accessToken, new AuthenticatedUserDisplayInfo(firstName, lastName, email, "myAffiliation", "myPosition"), extraEmails)); @@ -185,7 +189,12 @@ public String createNewAccount() { userNotificationService.sendNotification(user, new Timestamp(new Date().getTime()), UserNotification.Type.CREATEACC, null); - + + final OAuth2TokenData tokenData = newUser.getTokenData(); + tokenData.setUser(user); + tokenData.setOauthProviderId(newUser.getServiceId()); + oauth2Tokens.store(tokenData); + return "/dataverse.xhtml?faces-redirect=true"; } @@ -196,13 +205,13 @@ public String convertExistingAccount() { auReq.putCredential(creds.get(0).getTitle(), getUsername()); auReq.putCredential(creds.get(1).getTitle(), getPassword()); try { - AuthenticatedUser existingUser = authenticationSvc.authenticate(BuiltinAuthenticationProvider.PROVIDER_ID, auReq); + AuthenticatedUser existingUser = authenticationSvc.getCreateAuthenticatedUser(BuiltinAuthenticationProvider.PROVIDER_ID, auReq); authenticationSvc.updateProvider(existingUser, newUser.getServiceId(), newUser.getIdInService()); builtinUserSvc.removeUser(existingUser.getUserIdentifier()); session.setUser(existingUser); - AuthenticationProvider authProvider = authenticationSvc.getAuthenticationProvider(newUser.getServiceId()); - JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("oauth2.convertAccount.success", Arrays.asList(authProvider.getInfo().getTitle()))); + AuthenticationProvider newUserAuthProvider = authenticationSvc.getAuthenticationProvider(newUser.getServiceId()); + JsfHelper.addSuccessMessage(BundleUtil.getStringFromBundle("oauth2.convertAccount.success", Arrays.asList(newUserAuthProvider.getInfo().getTitle()))); return "/dataverse.xhtml?faces-redirect=true"; @@ -212,22 
+221,17 @@ public String convertExistingAccount() { } } - public String testAction() { - Logger.getLogger(OAuth2FirstLoginPage.class.getName()).log(Level.INFO, "testAction"); - return "dataverse.xhtml"; - } - public boolean isEmailAvailable() { return authenticationSvc.isEmailAddressAvailable(getSelectedEmail()); } - /** + /* * @todo This was copied from DataverseUserPage and modified so consider * consolidating common code (DRY). */ public void validateUserName(FacesContext context, UIComponent toValidate, Object value) { String userName = (String) value; - logger.fine("Validating username: " + userName); + logger.log(Level.FINE, "Validating username: {0}", userName); boolean userNameFound = authenticationSvc.identifierExists(userName); if (userNameFound) { ((UIInput) toValidate).setValid(false); @@ -236,7 +240,7 @@ public void validateUserName(FacesContext context, UIComponent toValidate, Objec } } - /** + /* * @todo This was copied from DataverseUserPage and modified so consider * consolidating common code (DRY). */ @@ -336,11 +340,7 @@ public String getCreateFromWhereTip() { public boolean isConvertFromBuiltinIsPossible() { AuthenticationProvider builtinAuthProvider = authenticationSvc.getAuthenticationProvider(BuiltinAuthenticationProvider.PROVIDER_ID); - if (builtinAuthProvider != null) { - return true; - } else { - return false; - } + return builtinAuthProvider != null; } public String getSuggestConvertInsteadOfCreate() { @@ -371,7 +371,7 @@ public List getEmailsToPickFrom() { } } } - logger.fine(emailsToPickFrom.size() + " emails to pick from: " + emailsToPickFrom); + logger.log(Level.FINE, "{0} emails to pick from: {1}", new Object[]{emailsToPickFrom.size(), emailsToPickFrom}); return emailsToPickFrom; } diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2LoginBackingBean.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2LoginBackingBean.java index 600e97b00bd..6fdc33b48b3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2LoginBackingBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2LoginBackingBean.java @@ -4,7 +4,6 @@ import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; import edu.harvard.iq.dataverse.authorization.UserRecordIdentifier; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; -import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.StringUtil; import java.io.BufferedReader; import java.io.IOException; @@ -26,8 +25,8 @@ import edu.harvard.iq.dataverse.util.SystemConfig; /** - * Backing bean of the oauth2 login process. Used from the login page and the - * callback page. + * Backing bean of the oauth2 login process. Used from the login and the + * callback pages. 
* * @author michael */ @@ -45,6 +44,9 @@ public class OAuth2LoginBackingBean implements Serializable { @EJB AuthenticationServiceBean authenticationSvc; + + @EJB + OAuth2TokenDataServiceBean oauth2Tokens; @EJB SystemConfig systemConfig; @@ -91,7 +93,7 @@ public void exchangeCodeForToken() throws IOException { oauthUser = idp.getUserRecord(code, state, getCallbackUrl()); UserRecordIdentifier idtf = oauthUser.getUserRecordIdentifier(); AuthenticatedUser dvUser = authenticationSvc.lookupUser(idtf); - + if (dvUser == null) { // need to create the user newAccountPage.setNewUser(oauthUser); @@ -100,6 +102,10 @@ public void exchangeCodeForToken() throws IOException { } else { // login the user and redirect to HOME of intended page (if any). session.setUser(dvUser); + final OAuth2TokenData tokenData = oauthUser.getTokenData(); + tokenData.setUser(dvUser); + tokenData.setOauthProviderId(idp.getId()); + oauth2Tokens.store(tokenData); String destination = redirectPage.orElse("/"); HttpServletResponse response = (HttpServletResponse) FacesContext.getCurrentInstance().getExternalContext().getResponse(); String prettyUrl = response.encodeRedirectURL(destination); diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenData.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenData.java new file mode 100644 index 00000000000..db29bae92bd --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenData.java @@ -0,0 +1,190 @@ +package edu.harvard.iq.dataverse.authorization.providers.oauth2; + +import com.github.scribejava.core.model.OAuth2AccessToken; +import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import java.io.Serializable; +import java.sql.Timestamp; +import javax.persistence.Column; +import javax.persistence.Entity; +import javax.persistence.GeneratedValue; +import javax.persistence.GenerationType; +import javax.persistence.Id; +import javax.persistence.ManyToOne; +import javax.persistence.NamedQueries; +import javax.persistence.NamedQuery; + +/** + * Token data for a given user, received from an OAuth2 system. Contains the + * user's access token for the remote system, as well as additional data, + * such as refresh token and expiry date. + * + * Persisting token data is a requirement for ORCID according to + * https://members.orcid.org/api/news/xsd-20-update which says "Store full + * responses from token exchange: access tokens, refresh tokens, scope, scope + * expiry to indicate an iD has been authenticated and with what scope" but we + * don't know how long responses need to be stored. There is no such requirement + * to store responses for any other OAuth provider. 
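+ * + * Instances are created from the ScribeJava {@code OAuth2AccessToken} via {@code OAuth2TokenData.from()} and persisted through {@code OAuth2TokenDataServiceBean.store()} after login; see {@code OAuth2LoginBackingBean.exchangeCodeForToken()} and {@code OAuth2FirstLoginPage.createNewAccount()}.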
+ * + * @author michael + */ +@NamedQueries({ + @NamedQuery( name="OAuth2TokenData.findByUserIdAndProviderId", + query = "SELECT d FROM OAuth2TokenData d WHERE d.user.id=:userId AND d.oauthProviderId=:providerId" ), + @NamedQuery( name="OAuth2TokenData.deleteByUserIdAndProviderId", + query = "DELETE FROM OAuth2TokenData d WHERE d.user.id=:userId AND d.oauthProviderId=:providerId" ) + +}) +@Entity +public class OAuth2TokenData implements Serializable { + + @Id + @GeneratedValue(strategy = GenerationType.IDENTITY) + private Long id; + + @ManyToOne + private AuthenticatedUser user; + + private String oauthProviderId; + + private Timestamp expiryDate; + + /** + * "Please don't put a maximum size on the storage for an access token" at + * https://stackoverflow.com/questions/4408945/what-is-the-length-of-the-access-token-in-facebook-oauth2/16365828#16365828 + */ + @Column(columnDefinition = "TEXT") + private String accessToken; + + @Column(length = 64) + private String refreshToken; + + @Column(length = 64) + private String scope; + + @Column(length = 32) + private String tokenType; + + @Column(columnDefinition = "TEXT") + private String rawResponse; + + + /** + * Creates a new {@link OAuth2TokenData} instance, based on the data in + * the passed {@link OAuth2AccessToken}. + * @param accessTokenResponse The token parsed by the ScribeJava library. + * @return A new, pre-populated {@link OAuth2TokenData}. + */ + public static OAuth2TokenData from( OAuth2AccessToken accessTokenResponse ) { + OAuth2TokenData retVal = new OAuth2TokenData(); + retVal.setAccessToken(accessTokenResponse.getAccessToken()); + retVal.setRefreshToken( accessTokenResponse.getRefreshToken() ); + retVal.setScope( accessTokenResponse.getScope() ); + retVal.setTokenType( accessTokenResponse.getTokenType() ); + if ( accessTokenResponse.getExpiresIn() != null ) { + retVal.setExpiryDate( new Timestamp( System.currentTimeMillis() + accessTokenResponse.getExpiresIn())); + } + retVal.setRawResponse( accessTokenResponse.getRawResponse() ); + + return retVal; + } + + public Long getId() { + return id; + } + + public void setId(Long id) { + this.id = id; + } + + public AuthenticatedUser getUser() { + return user; + } + + public void setUser(AuthenticatedUser user) { + this.user = user; + } + + public String getOauthProviderId() { + return oauthProviderId; + } + + public void setOauthProviderId(String oauthProviderId) { + this.oauthProviderId = oauthProviderId; + } + + public Timestamp getExpiryDate() { + return expiryDate; + } + + public void setExpiryDate(Timestamp expiryDate) { + this.expiryDate = expiryDate; + } + + public String getAccessToken() { + return accessToken; + } + + public void setAccessToken(String accessToken) { + this.accessToken = accessToken; + } + + public String getRefreshToken() { + return refreshToken; + } + + public void setRefreshToken(String refreshToken) { + this.refreshToken = refreshToken; + } + + public String getScope() { + return scope; + } + + public void setScope(String scope) { + this.scope = scope; + } + + public String getTokenType() { + return tokenType; + } + + public void setTokenType(String tokenType) { + this.tokenType = tokenType; + } + + public String getRawResponse() { + return rawResponse; + } + + public void setRawResponse(String rawResponse) { + this.rawResponse = rawResponse; + } + + @Override + public int hashCode() { + int hash = 5; + hash = 71 * hash + (int) (this.id ^ (this.id >>> 32)); + return hash; + } + + @Override + public boolean equals(Object obj) { + if (this == obj) { + 
return true; + } + if (obj == null) { + return false; + } + if (getClass() != obj.getClass()) { + return false; + } + final OAuth2TokenData other = (OAuth2TokenData) obj; + if (this.id != other.id) { + return false; + } + return true; + } + + + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenDataServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenDataServiceBean.java new file mode 100644 index 00000000000..d8f1fa7600b --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2TokenDataServiceBean.java @@ -0,0 +1,44 @@ +package edu.harvard.iq.dataverse.authorization.providers.oauth2; + +import java.util.List; +import java.util.Optional; +import javax.ejb.Stateless; +import javax.inject.Named; +import javax.persistence.EntityManager; +import javax.persistence.PersistenceContext; + +/** + * CRUD for {@link OAuth2TokenData}. + * + * @author michael + */ +@Stateless +public class OAuth2TokenDataServiceBean { + + @PersistenceContext + private EntityManager em; + + public void store( OAuth2TokenData tokenData ) { + if ( tokenData.getId() != null ) { + // token exists, this is an update + em.merge(tokenData); + + } else { + // ensure there's only one token for each user/service pair. + em.createNamedQuery("OAuth2TokenData.deleteByUserIdAndProviderId") + .setParameter("userId", tokenData.getUser().getId() ) + .setParameter("providerId", tokenData.getOauthProviderId() ) + .executeUpdate(); + em.persist( tokenData ); + } + } + + public Optional get( long authenticatedUserId, String serviceId ) { + final List tokens = em.createNamedQuery("OAuth2TokenData.findByUserIdAndProviderId", OAuth2TokenData.class) + .setParameter("userId", authenticatedUserId ) + .setParameter("providerId", serviceId ) + .getResultList(); + return Optional.ofNullable( tokens.isEmpty() ? null : tokens.get(0) ); + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2UserRecord.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2UserRecord.java index eca69a3697f..234c2828ab5 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2UserRecord.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/OAuth2UserRecord.java @@ -20,19 +20,19 @@ public class OAuth2UserRecord implements java.io.Serializable { /** A potentially mutable String that is easier on the eye than a number. 
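// A sketch of a null-safe variant of the equals()/hashCode() pair above, using java.util.Objects.
// It is shown only because the boxed Long id is compared with != (reference identity) and unboxed
// in hashCode(); this is an alternative sketch, not the implementation in this change.
import java.util.Objects;

class IdentityByIdExample {
    private Long id;

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 when id is null, no unboxing NPE
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        return Objects.equals(this.id, ((IdentityByIdExample) obj).id); // null-safe value comparison
    }
}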
*/ private final String username; - private final String accessToken; - private final AuthenticatedUserDisplayInfo displayInfo; private final List availableEmailAddresses; + private final OAuth2TokenData tokenData; + public OAuth2UserRecord(String aServiceId, String anIdInService, String aUsername, - String anAccessToken, AuthenticatedUserDisplayInfo aDisplayInfo, + OAuth2TokenData someTokenData, AuthenticatedUserDisplayInfo aDisplayInfo, List someAvailableEmailAddresses) { serviceId = aServiceId; idInService = anIdInService; username = aUsername; - accessToken = anAccessToken; + tokenData = someTokenData; displayInfo = aDisplayInfo; availableEmailAddresses = someAvailableEmailAddresses; } @@ -49,10 +49,6 @@ public String getUsername() { return username; } - public String getAccessToken() { - return accessToken; - } - public List getAvailableEmailAddresses() { return availableEmailAddresses; } @@ -61,6 +57,10 @@ public AuthenticatedUserDisplayInfo getDisplayInfo() { return displayInfo; } + public OAuth2TokenData getTokenData() { + return tokenData; + } + @Override public String toString() { return "OAuth2UserRecord{" + "serviceId=" + serviceId + ", idInService=" + idInService + '}'; diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidApi.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidApi.java index 9fffe2171bc..d5f32e67bc0 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidApi.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidApi.java @@ -12,7 +12,7 @@ public class OrcidApi extends DefaultApi20 { /** - * The instance holder pattern allows for lazy creation of the intance. + * The instance holder pattern allows for lazy creation of the instance. 
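// A generic sketch of the initialization-on-demand holder idiom referenced in the Javadoc above;
// LazySingleton is a hypothetical class. The nested holder class is not loaded, and INSTANCE not
// created, until getInstance() is first called, and class loading makes the publication thread-safe.
class LazySingleton {

    private LazySingleton() {
        // expensive construction goes here
    }

    private static class InstanceHolder {
        private static final LazySingleton INSTANCE = new LazySingleton();
    }

    static LazySingleton getInstance() {
        return InstanceHolder.INSTANCE;
    }
}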
*/ private static class SandboxInstanceHolder { private static final OrcidApi INSTANCE = diff --git a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidOAuth2AP.java b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidOAuth2AP.java index c56fe77bcf0..dbc4b0ac4e6 100644 --- a/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidOAuth2AP.java +++ b/src/main/java/edu/harvard/iq/dataverse/authorization/providers/oauth2/impl/OrcidOAuth2AP.java @@ -2,10 +2,16 @@ import com.github.scribejava.core.builder.api.BaseApi; import com.github.scribejava.core.model.OAuth2AccessToken; +import com.github.scribejava.core.model.OAuthRequest; +import com.github.scribejava.core.model.Response; +import com.github.scribejava.core.model.Verb; +import com.github.scribejava.core.oauth.OAuth20Service; import edu.harvard.iq.dataverse.authorization.AuthenticatedUserDisplayInfo; import edu.harvard.iq.dataverse.authorization.AuthenticationProviderDisplayInfo; import edu.harvard.iq.dataverse.authorization.providers.oauth2.AbstractOAuth2AuthenticationProvider; import edu.harvard.iq.dataverse.authorization.providers.oauth2.OAuth2Exception; +import edu.harvard.iq.dataverse.authorization.providers.oauth2.OAuth2TokenData; +import edu.harvard.iq.dataverse.authorization.providers.oauth2.OAuth2UserRecord; import edu.harvard.iq.dataverse.util.BundleUtil; import java.io.IOException; import java.io.StringReader; @@ -13,12 +19,15 @@ import java.util.Arrays; import java.util.Collections; import java.util.List; +import java.util.Objects; import java.util.logging.Level; import java.util.logging.Logger; import java.util.stream.Collectors; +import static java.util.stream.Collectors.joining; import java.util.stream.IntStream; import java.util.stream.Stream; import javax.json.Json; +import javax.json.JsonObject; import javax.json.JsonReader; import javax.xml.parsers.DocumentBuilder; import javax.xml.parsers.DocumentBuilderFactory; @@ -28,11 +37,17 @@ import org.w3c.dom.NodeList; import org.xml.sax.InputSource; import org.xml.sax.SAXException; +import javax.xml.xpath.XPathFactory; +import javax.xml.xpath.XPath; +import javax.xml.xpath.XPathConstants; +import javax.xml.xpath.XPathExpression; /** * OAuth2 identity provider for ORCiD. Note that ORCiD has two systems: sandbox * and production. Hence having the user endpoint as a parameter. + * * @author michael + * @author pameyer */ public class OrcidOAuth2AP extends AbstractOAuth2AuthenticationProvider { @@ -61,41 +76,69 @@ public String getUserEndpoint( OAuth2AccessToken token ) { public BaseApi getApiInstance() { return OrcidApi.instance( ! baseUserEndpoint.contains("sandbox") ); } + + @Override + public OAuth2UserRecord getUserRecord(String code, String state, String redirectUrl) throws IOException, OAuth2Exception { + OAuth20Service service = getService(state, redirectUrl); + OAuth2AccessToken accessToken = service.getAccessToken(code); + + if ( ! accessToken.getScope().contains(scope) ) { + // We did not get the permissions on the scope we need. Abort and inform the user. 
+ throw new OAuth2Exception(200, BundleUtil.getStringFromBundle("auth.providers.orcid.insufficientScope"), ""); + } + + String orcidNumber = extractOrcidNumber(accessToken.getRawResponse()); + + final String userEndpoint = getUserEndpoint(accessToken); + + final OAuthRequest request = new OAuthRequest(Verb.GET, userEndpoint, service); + request.addHeader("Authorization", "Bearer " + accessToken.getAccessToken()); + request.setCharset("UTF-8"); + + final Response response = request.send(); + int responseCode = response.getCode(); + final String body = response.getBody(); + logger.log(Level.FINE, "In getUserRecord. Body: {0}", body); + if ( responseCode == 200 ) { + final ParsedUserResponse parsed = parseUserResponse(body); + AuthenticatedUserDisplayInfo orgData = getOrganizationalData(userEndpoint, accessToken.getAccessToken(), service); + parsed.displayInfo.setAffiliation(orgData.getAffiliation()); + parsed.displayInfo.setPosition(orgData.getPosition()); + + return new OAuth2UserRecord(getId(), orcidNumber, + parsed.username, + OAuth2TokenData.from(accessToken), + parsed.displayInfo, + parsed.emails); + } else { + throw new OAuth2Exception(responseCode, body, "Error getting the user info record."); + } + } + @Override protected ParsedUserResponse parseUserResponse(String responseBody) { DocumentBuilderFactory dbFact = DocumentBuilderFactory.newInstance(); try ( StringReader reader = new StringReader(responseBody)) { DocumentBuilder db = dbFact.newDocumentBuilder(); Document doc = db.parse( new InputSource(reader) ); - List orcidIdNodeList = getNodes(doc, "orcid-message", "orcid-profile","orcid-identifier","path"); - if ( orcidIdNodeList.size() != 1 ) { - throw new OAuth2Exception(0, responseBody, "Cannot find ORCiD id in response."); - } - String orcidId = orcidIdNodeList.get(0).getTextContent().trim(); - String firstName = getNodes(doc, "orcid-message", "orcid-profile", "orcid-bio", "personal-details", "given-names" ) + + String firstName = getNodes(doc, "person:person", "person:name", "personal-details:given-names" ) .stream().findFirst().map( Node::getTextContent ) .map( String::trim ).orElse(""); - String familyName = getNodes(doc, "orcid-message", "orcid-profile", "orcid-bio", "personal-details", "family-name" ) + String familyName = getNodes(doc, "person:person", "person:name", "personal-details:family-name") .stream().findFirst().map( Node::getTextContent ) .map( String::trim ).orElse(""); - String affiliation = getNodes(doc, "orcid-message", "orcid-profile", "orcid-activities", "affiliations", "affiliation", "organization", "name" ) + + // fallback - try to use the credit-name + if ( (firstName + familyName).equals("") ) { + firstName = getNodes(doc, "person:person", "person:name", "personal-details:credit-name" ) .stream().findFirst().map( Node::getTextContent ) .map( String::trim ).orElse(""); - List emails = new ArrayList<>(); - getNodes(doc, "orcid-message", "orcid-profile", "orcid-bio","contact-details","email").forEach( n ->{ - String email = n.getTextContent().trim(); - Node primaryAtt = n.getAttributes().getNamedItem("primary"); - boolean isPrimary = (primaryAtt!=null) && - (primaryAtt.getTextContent()!=null) && - (primaryAtt.getTextContent().trim().toLowerCase().equals("true")); - if ( isPrimary ) { - emails.add(0, email); - } else { - emails.add(email); - } - }); - String primaryEmail = (emails.size()>1) ? 
emails.get(0) : ""; + } + + String primaryEmail = getPrimaryEmail(doc); + List emails = getAllEmails(doc); // make the username up String username; @@ -104,9 +147,13 @@ protected ParsedUserResponse parseUserResponse(String responseBody) { } else { username = firstName.split(" ")[0] + "." + familyName; } + username = username.replaceAll("[^a-zA-Z0-9.]",""); + // returning the parsed user. The user-id-in-provider will be added by the caller, since ORCiD passes it + // on the access token response. + // Affilifation added after a later call. final ParsedUserResponse userResponse = new ParsedUserResponse( - new AuthenticatedUserDisplayInfo(firstName, familyName, primaryEmail, affiliation, ""), orcidId, username); + new AuthenticatedUserDisplayInfo(firstName, familyName, primaryEmail, "", ""), null, username); userResponse.emails.addAll(emails); return userResponse; @@ -117,8 +164,6 @@ protected ParsedUserResponse parseUserResponse(String responseBody) { logger.log(Level.SEVERE, "I/O error parsing response body from ORCiD: " + ex.getMessage(), ex); } catch (ParserConfigurationException ex) { logger.log(Level.SEVERE, "While parsing the ORCiD response: Bad parse configuration. " + ex.getMessage(), ex); - } catch (OAuth2Exception ex) { - logger.log(Level.SEVERE, "Semantic error parsing response body from ORCiD: " + ex.getMessage(), ex); } return null; @@ -146,6 +191,52 @@ private List getNodes( Node node, List path ) { } } + + /** + * retrieve email from ORCID 2.0 response document, or empty string if no primary email is present + */ + private String getPrimaryEmail(Document doc) { + // `xmlstarlet sel -t -c "/record:record/person:person/email:emails/email:email[@primary='true']/email:email"`, if you're curious + String p = "/person/emails/email[@primary='true']/email/text()"; + NodeList emails = xpathMatches( doc, p ); + String primaryEmail = ""; + if ( 1 == emails.getLength() ) { + primaryEmail = emails.item(0).getTextContent(); + } + // if there are no (or somehow more than 1) primary email(s), then we've already at failure value + return primaryEmail; + } + + /** + * retrieve all emails (including primary) from ORCID 2.0 response document + */ + private List getAllEmails(Document doc) { + String p = "/person/emails/email/email/text()"; + NodeList emails = xpathMatches( doc, p ); + List rs = new ArrayList<>(); + for(int i=0;i storageIO, return false; } } catch (FileNotFoundException fnfe) { - logger.fine("No .img file for this worldmap file yet; giving up."); + logger.fine("No .img file for this worldmap file yet; giving up. Original Error: " + fnfe); return false; } catch (IOException ioex) { - logger.warning("caught IOException trying to open an input stream for worldmap .img file (" + storageIO.getDataFile().getStorageIdentifier() + ")"); + logger.warning("caught IOException trying to open an input stream for worldmap .img file (" + storageIO.getDataFile().getStorageIdentifier() + "). 
Original Error: " + ioex); return false; } diff --git a/src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java b/src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java index 4729051e1ba..ac4af2e01dd 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java @@ -1,8 +1,10 @@ package edu.harvard.iq.dataverse.dataaccess; import com.amazonaws.AmazonClientException; +import com.amazonaws.HttpMethod; import com.amazonaws.SdkClientException; import com.amazonaws.auth.AWSCredentials; +import com.amazonaws.auth.AWSCredentialsProvider; import com.amazonaws.auth.AWSStaticCredentialsProvider; import com.amazonaws.auth.profile.ProfileCredentialsProvider; import com.amazonaws.regions.Regions; @@ -14,10 +16,12 @@ import com.amazonaws.services.s3.model.DeleteObjectRequest; import com.amazonaws.services.s3.model.DeleteObjectsRequest; import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion; +import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest; import com.amazonaws.services.s3.model.GetObjectRequest; import com.amazonaws.services.s3.model.ListObjectsRequest; import com.amazonaws.services.s3.model.MultiObjectDeleteException; import com.amazonaws.services.s3.model.ObjectListing; +import com.amazonaws.services.s3.model.ResponseHeaderOverrides; import com.amazonaws.services.s3.model.S3Object; import com.amazonaws.services.s3.model.S3ObjectSummary; import edu.harvard.iq.dataverse.DataFile; @@ -25,18 +29,25 @@ import edu.harvard.iq.dataverse.Dataverse; import edu.harvard.iq.dataverse.DvObject; import edu.harvard.iq.dataverse.datavariable.DataVariable; +import edu.harvard.iq.dataverse.util.FileUtil; +import java.io.ByteArrayInputStream; import java.io.File; +import java.io.FileInputStream; import java.io.FileNotFoundException; +import java.io.FileOutputStream; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; +import java.net.URL; import java.nio.channels.Channel; import java.nio.channels.Channels; import java.nio.channels.WritableByteChannel; import java.nio.file.Path; +import java.nio.file.Paths; import java.util.ArrayList; import java.util.Date; import java.util.List; +import java.util.Random; import java.util.logging.Logger; import org.apache.commons.io.IOUtils; @@ -67,20 +78,16 @@ public S3AccessIO(T dvObject, DataAccessRequest req) { super(dvObject, req); this.setIsLocalFile(false); try { - awsCredentials = new ProfileCredentialsProvider().getCredentials(); - s3 = AmazonS3ClientBuilder.standard().withCredentials(new AWSStaticCredentialsProvider(awsCredentials)).withRegion(Regions.US_EAST_1).build(); + s3 = AmazonS3ClientBuilder.standard().defaultClient(); } catch (Exception e) { throw new AmazonClientException( - "Cannot load the credentials from the credential profiles file. 
" - + "Please make sure that your credentials file is at the correct " - + "location (~/.aws/credentials), and is in valid format.", + "Cannot instantiate a S3 client using AWS SDK defaults for credentials and region", e); } } public static String S3_IDENTIFIER_PREFIX = "s3"; - private AWSCredentials awsCredentials = null; private AmazonS3 s3 = null; private String bucketName = System.getProperty("dataverse.files.s3-bucket-name"); private String key; @@ -186,6 +193,7 @@ public void savePath(Path fileSystemPath) throws IOException { File inputFile = fileSystemPath.toFile(); if (dvObject instanceof DataFile) { s3.putObject(new PutObjectRequest(bucketName, key, inputFile)); + newFileSize = inputFile.length(); } else { throw new IOException("DvObject type other than datafile is not yet supported"); @@ -205,6 +213,25 @@ public void savePath(Path fileSystemPath) throws IOException { setSize(newFileSize); } + /** + * Implements the StorageIO saveInputStream() method. + * This implementation is somewhat problematic, because S3 cannot save an object of + * an unknown length. This effectively nullifies any benefits of streaming; + * as we cannot start saving until we have read the entire stream. + * One way of solving this would be to buffer the entire stream as byte[], + * in memory, then save it... Which of course would be limited by the amount + * of memory available, and thus would not work for streams larger than that. + * So we have eventually decided to save save the stream to a temp file, then + * save to S3. This is slower, but guaranteed to work on any size stream. + * An alternative we may want to consider is to not implement this method + * in the S3 driver, and make it throw the UnsupportedDataAccessOperationException, + * similarly to how we handle attempts to open OutputStreams, in this and the + * Swift driver. + * + * @param inputStream InputStream we want to save + * @param auxItemTag String representing this Auxiliary type ("extension") + * @throws IOException if anything goes wrong. + */ @Override public void saveInputStream(InputStream inputStream, Long filesize) throws IOException { if (filesize == null || filesize < 0) { @@ -235,24 +262,23 @@ public void saveInputStream(InputStream inputStream) throws IOException { if (!this.canWrite()) { open(DataAccessOption.WRITE_ACCESS); } - //TODO? Copying over the object to a byte array is farily inefficient. - // We need the length of the data to upload inputStreams (see our putObject calls). - // There may be ways to work around this, see https://github.com/aws/aws-sdk-java/issues/474 to start. - // This is out of scope of creating the S3 driver and referenced in issue #4064! 
- byte[] bytes = IOUtils.toByteArray(inputStream); - long length = bytes.length; - ObjectMetadata metadata = new ObjectMetadata(); - metadata.setContentLength(length); + String directoryString = FileUtil.getFilesTempDirectory(); + + Random rand = new Random(); + Path tempPath = Paths.get(directoryString, Integer.toString(rand.nextInt(Integer.MAX_VALUE))); + File tempFile = createTempFile(tempPath, inputStream); + try { - s3.putObject(bucketName, key, inputStream, metadata); + s3.putObject(bucketName, key, tempFile); } catch (SdkClientException ioex) { String failureMsg = ioex.getMessage(); if (failureMsg == null) { failureMsg = "S3AccessIO: Unknown exception occured while uploading a local file into S3 Storage."; } - + tempFile.delete(); throw new IOException(failureMsg); } + tempFile.delete(); setSize(s3.getObjectMetadata(bucketName, key).getContentLength()); } @@ -336,7 +362,7 @@ public void savePathAsAux(Path fileSystemPath, String auxItemTag) throws IOExcep String destinationKey = getDestinationKey(auxItemTag); try { File inputFile = fileSystemPath.toFile(); - s3.putObject(new PutObjectRequest(bucketName, destinationKey, inputFile)); + s3.putObject(new PutObjectRequest(bucketName, destinationKey, inputFile)); } catch (AmazonClientException ase) { logger.warning("Caught an AmazonServiceException in S3AccessIO.savePathAsAux(): " + ase.getMessage()); throw new IOException("S3AccessIO: Failed to save path as an auxiliary object."); @@ -367,31 +393,71 @@ public void saveInputStreamAsAux(InputStream inputStream, String auxItemTag, Lon } } - //todo: add new method with size? - //or just check the data file content size? - // this method copies a local InputStream into this DataAccess Auxiliary location: + /** + * Implements the StorageIO saveInputStreamAsAux() method. + * This implementation is problematic, because S3 cannot save an object of + * an unknown length. This effectively nullifies any benefits of streaming; + * as we cannot start saving until we have read the entire stream. + * One way of solving this would be to buffer the entire stream as byte[], + * in memory, then save it... Which of course would be limited by the amount + * of memory available, and thus would not work for streams larger than that. + * So we have eventually decided to save save the stream to a temp file, then + * save to S3. This is slower, but guaranteed to work on any size stream. + * An alternative we may want to consider is to not implement this method + * in the S3 driver, and make it throw the UnsupportedDataAccessOperationException, + * similarly to how we handle attempts to open OutputStreams, in this and the + * Swift driver. + * + * @param inputStream InputStream we want to save + * @param auxItemTag String representing this Auxiliary type ("extension") + * @throws IOException if anything goes wrong. 
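// A sketch of an alternative way to stage the stream, assuming java.nio.file is acceptable here:
// Files.createTempFile avoids the hand-rolled random file names and Files.copy replaces the manual
// read/write loop. TempStaging and stageToTempFile are hypothetical names, not part of this change.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class TempStaging {

    Path stageToTempFile(InputStream inputStream, String tempDirectory) throws IOException {
        Path tempFile = Files.createTempFile(Paths.get(tempDirectory), "s3upload", ".tmp");
        // copies the whole stream to disk; the caller still deletes the file after the S3 upload
        Files.copy(inputStream, tempFile, StandardCopyOption.REPLACE_EXISTING);
        return tempFile;
    }
}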
+ */ @Override public void saveInputStreamAsAux(InputStream inputStream, String auxItemTag) throws IOException { if (!this.canWrite()) { open(DataAccessOption.WRITE_ACCESS); } + + String directoryString = FileUtil.getFilesTempDirectory(); + + Random rand = new Random(); + String pathNum = Integer.toString(rand.nextInt(Integer.MAX_VALUE)); + Path tempPath = Paths.get(directoryString, pathNum); + File tempFile = createTempFile(tempPath, inputStream); + String destinationKey = getDestinationKey(auxItemTag); - byte[] bytes = IOUtils.toByteArray(inputStream); - long length = bytes.length; - ObjectMetadata metadata = new ObjectMetadata(); - metadata.setContentLength(length); + try { - s3.putObject(bucketName, destinationKey, inputStream, metadata); + s3.putObject(bucketName, destinationKey, tempFile); } catch (SdkClientException ioex) { String failureMsg = ioex.getMessage(); if (failureMsg == null) { failureMsg = "S3AccessIO: Unknown exception occured while saving a local InputStream as S3Object"; } + tempFile.delete(); throw new IOException(failureMsg); } + tempFile.delete(); } - + + //Helper method for supporting saving streams with unknown length to S3 + //We save those streams to a file and then upload the file + private File createTempFile(Path path, InputStream inputStream) throws IOException { + + File targetFile = new File(path.toUri()); //File needs a name + OutputStream outStream = new FileOutputStream(targetFile); + + byte[] buffer = new byte[8 * 1024]; + int bytesRead; + while ((bytesRead = inputStream.read(buffer)) != -1) { + outStream.write(buffer, 0, bytesRead); + } + IOUtils.closeQuietly(inputStream); + IOUtils.closeQuietly(outStream); + return targetFile; + } + @Override public List listAuxObjects() throws IOException { if (!this.canWrite()) { @@ -405,7 +471,7 @@ public List listAuxObjects() throws IOException { List storedAuxFilesSummary = storedAuxFilesList.getObjectSummaries(); try { while (storedAuxFilesList.isTruncated()) { - logger.fine("S3 listAuxObjects: going to second page of list"); + logger.fine("S3 listAuxObjects: going to next page of list"); storedAuxFilesList = s3.listNextBatchOfObjects(storedAuxFilesList); storedAuxFilesSummary.addAll(storedAuxFilesList.getObjectSummaries()); } @@ -416,7 +482,7 @@ public List listAuxObjects() throws IOException { for (S3ObjectSummary item : storedAuxFilesSummary) { String destinationKey = item.getKey(); - String fileName = destinationKey.substring(destinationKey.lastIndexOf("/")); + String fileName = destinationKey.substring(destinationKey.lastIndexOf(".") + 1); logger.fine("S3 cached aux object fileName: " + fileName); ret.add(fileName); } @@ -534,6 +600,9 @@ private String getDestinationKey(String auxItemTag) throws IOException { if (dvObject instanceof DataFile) { return getMainFileKey() + "." + auxItemTag; } else if (dvObject instanceof Dataset) { + if (key == null) { + open(); + } return key + "/" + auxItemTag; } else { throw new IOException("S3AccessIO: This operation is only supported for Datasets and DataFiles."); @@ -559,4 +628,57 @@ private String getMainFileKey() throws IOException { return key; } + + public String generateTemporaryS3Url() throws IOException { + //Questions: + // Q. Should this work for private and public? + // A. Yes! Since the URL has a limited, short life span. -- L.A. + // Q. how long should the download url work? + // A. 1 hour by default seems like an OK number. Making it configurable seems like a good idea too. -- L.A. + if (s3 == null) { + throw new IOException("ERROR: s3 not initialised. 
"); + } + if (dvObject instanceof DataFile) { + key = getMainFileKey(); + java.util.Date expiration = new java.util.Date(); + long msec = expiration.getTime(); + msec += 1000 * getUrlExpirationMinutes(); + expiration.setTime(msec); + + GeneratePresignedUrlRequest generatePresignedUrlRequest = + new GeneratePresignedUrlRequest(bucketName, key); + generatePresignedUrlRequest.setMethod(HttpMethod.GET); // Default. + generatePresignedUrlRequest.setExpiration(expiration); + ResponseHeaderOverrides responseHeaders = new ResponseHeaderOverrides(); + responseHeaders.setContentDisposition("attachment; filename="+this.getDataFile().getDisplayName()); + responseHeaders.setContentType(this.getDataFile().getContentType()); + generatePresignedUrlRequest.setResponseHeaders(responseHeaders); + + URL s = s3.generatePresignedUrl(generatePresignedUrlRequest); + + return s.toString(); + } else if (dvObject instanceof Dataset) { + throw new IOException("Data Access: GenerateTemporaryS3Url: Invalid DvObject type : Dataset"); + } else if (dvObject instanceof Dataverse) { + throw new IOException("Data Access: Invalid DvObject type : Dataverse"); + } else { + throw new IOException("Data Access: Invalid DvObject type"); + } + } + + private int getUrlExpirationMinutes() { + String optionValue = System.getProperty("dataverse.files.s3-url-expiration-minutes"); + if (optionValue != null) { + Integer num; + try { + num = new Integer(optionValue); + } catch (NumberFormatException ex) { + num = null; + } + if (num != null) { + return num; + } + } + return 60; + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/dataaccess/StorageIO.java b/src/main/java/edu/harvard/iq/dataverse/dataaccess/StorageIO.java index 00e1222c68d..0e53430f5ba 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataaccess/StorageIO.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataaccess/StorageIO.java @@ -106,6 +106,27 @@ public boolean canWrite() { public abstract void savePath(Path fileSystemPath) throws IOException; // same, for an InputStream: + /** + * This method copies a local InputStream into this DataAccess location. + * Note that the S3 driver implementation of this abstract method is problematic, + * because S3 cannot save an object of an unknown length. This effectively + * nullifies any benefits of streaming; as we cannot start saving until we + * have read the entire stream. + * One way of solving this would be to buffer the entire stream as byte[], + * in memory, then save it... Which of course would be limited by the amount + * of memory available, and thus would not work for streams larger than that. + * So we have eventually decided to save save the stream to a temp file, then + * save to S3. This is slower, but guaranteed to work on any size stream. + * An alternative we may want to consider is to not implement this method + * in the S3 driver, and make it throw the UnsupportedDataAccessOperationException, + * similarly to how we handle attempts to open OutputStreams, in this and the + * Swift driver. + * (Not an issue in either FileAccessIO or SwiftAccessIO implementations) + * + * @param inputStream InputStream we want to save + * @param auxItemTag String representing this Auxiliary type ("extension") + * @throws IOException if anything goes wrong. 
+ */ public abstract void saveInputStream(InputStream inputStream) throws IOException; public abstract void saveInputStream(InputStream inputStream, Long filesize) throws IOException; @@ -133,7 +154,27 @@ public boolean canWrite() { // this method copies a local filesystem Path into this DataAccess Auxiliary location: public abstract void savePathAsAux(Path fileSystemPath, String auxItemTag) throws IOException; - // this method copies a local InputStream into this DataAccess Auxiliary location: + /** + * This method copies a local InputStream into this DataAccess Auxiliary location. + * Note that the S3 driver implementation of this abstract method is problematic, + * because S3 cannot save an object of an unknown length. This effectively + * nullifies any benefits of streaming; as we cannot start saving until we + * have read the entire stream. + * One way of solving this would be to buffer the entire stream as byte[], + * in memory, then save it... Which of course would be limited by the amount + * of memory available, and thus would not work for streams larger than that. + * So we have eventually decided to save save the stream to a temp file, then + * save to S3. This is slower, but guaranteed to work on any size stream. + * An alternative we may want to consider is to not implement this method + * in the S3 driver, and make it throw the UnsupportedDataAccessOperationException, + * similarly to how we handle attempts to open OutputStreams, in this and the + * Swift driver. + * (Not an issue in either FileAccessIO or SwiftAccessIO implementations) + * + * @param inputStream InputStream we want to save + * @param auxItemTag String representing this Auxiliary type ("extension") + * @throws IOException if anything goes wrong. + */ public abstract void saveInputStreamAsAux(InputStream inputStream, String auxItemTag) throws IOException; public abstract void saveInputStreamAsAux(InputStream inputStream, String auxItemTag, Long filesize) throws IOException; diff --git a/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java b/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java index 101c5fb7804..7bdffed6281 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java @@ -267,7 +267,7 @@ public static Dataset persistDatasetLogoToStorageAndCreateThumbnail(Dataset data StorageIO dataAccess = null; try{ - dataAccess = DataAccess.createNewStorageIO(dataset,"file"); + dataAccess = DataAccess.createNewStorageIO(dataset,"placeholder"); } catch(IOException ioex){ //TODO: Add a suitable waing message diff --git a/src/main/java/edu/harvard/iq/dataverse/datasetutility/TwoRavensHelper.java b/src/main/java/edu/harvard/iq/dataverse/datasetutility/TwoRavensHelper.java deleted file mode 100644 index 76e740a5f90..00000000000 --- a/src/main/java/edu/harvard/iq/dataverse/datasetutility/TwoRavensHelper.java +++ /dev/null @@ -1,323 +0,0 @@ -/* - * To change this license header, choose License Headers in Project Properties. - * To change this template file, choose Tools | Templates - * and open the template in the editor. 
- */ -package edu.harvard.iq.dataverse.datasetutility; - -import edu.harvard.iq.dataverse.Dataset; -import edu.harvard.iq.dataverse.DataverseSession; -import edu.harvard.iq.dataverse.DvObject; -import edu.harvard.iq.dataverse.FileMetadata; -import edu.harvard.iq.dataverse.PermissionServiceBean; -import edu.harvard.iq.dataverse.SettingsWrapper; -import edu.harvard.iq.dataverse.authorization.AuthenticationServiceBean; -import edu.harvard.iq.dataverse.authorization.Permission; -import edu.harvard.iq.dataverse.authorization.users.ApiToken; -import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; -import edu.harvard.iq.dataverse.authorization.users.GuestUser; -import edu.harvard.iq.dataverse.authorization.users.User; -import edu.harvard.iq.dataverse.settings.SettingsServiceBean; -import java.util.HashMap; -import java.util.Map; -import javax.faces.view.ViewScoped; -import javax.inject.Inject; -import javax.inject.Named; - -/** - * - * @author rmp553 - - */ -@ViewScoped -@Named -public class TwoRavensHelper implements java.io.Serializable { - - @Inject SettingsWrapper settingsWrapper; - @Inject PermissionServiceBean permissionService; - @Inject AuthenticationServiceBean authService; - - @Inject - DataverseSession session; - - private final Map fileMetadataTwoRavensExploreMap = new HashMap<>(); // { FileMetadata.id : Boolean } - - public TwoRavensHelper(){ - - } - - - /** - * Call this from a Dataset or File page - * - calls private method canSeeTwoRavensExploreButton - * - * WARNING: Before calling this, make sure the user has download - * permission for the file!! (See DatasetPage.canDownloadFile()) - * - * @param fm - * @return - */ - public boolean canSeeTwoRavensExploreButtonFromAPI(FileMetadata fm, User user){ - - if (fm == null){ - return false; - } - - if (user == null){ - return false; - } - - if (!this.permissionService.userOn(user, fm.getDataFile()).has(Permission.DownloadFile)){ - return false; - } - - return this.canSeeTwoRavensExploreButton(fm, true); - } - - /** - * Call this from a Dataset or File page - * - calls private method canSeeTwoRavensExploreButton - * - * WARNING: Before calling this, make sure the user has download - * permission for the file!! (See DatasetPage.canDownloadFile()) - * - * @param fm - * @return - */ - public boolean canSeeTwoRavensExploreButtonFromPage(FileMetadata fm){ - - if (fm == null){ - return false; - } - - return this.canSeeTwoRavensExploreButton(fm, true); - } - - /** - * Used to check whether a tabular file - * may be viewed via TwoRavens - * - * @param fm - * @return - */ - public boolean canSeeTwoRavensExploreButton(FileMetadata fm, boolean permissionsChecked){ - if (fm == null){ - return false; - } - - // This is only here as a reminder to the public method users - if (!permissionsChecked){ - return false; - } - - if (!fm.getDataFile().isTabularData()){ - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - - // Has this already been checked? - if (this.fileMetadataTwoRavensExploreMap.containsKey(fm.getId())){ - // Yes, return previous answer - //logger.info("using cached result for candownloadfile on filemetadata "+fid); - return this.fileMetadataTwoRavensExploreMap.get(fm.getId()); - } - - - // (1) Is TwoRavens active via the "setting" table? 
- // Nope: get out - // - if (!settingsWrapper.isTrueForKey(SettingsServiceBean.Key.TwoRavensTabularView, false)){ - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - - //---------------------------------------------------------------------- - //(1a) Before we do any testing - if version is deaccessioned and user - // does not have edit dataset permission then may download - //--- - - // (2) Is the DataFile object there and persisted? - // Nope: scat - // - if ((fm.getDataFile() == null)||(fm.getDataFile().getId()==null)){ - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - - if (fm.getDatasetVersion().isDeaccessioned()) { - if (this.doesSessionUserHavePermission( Permission.EditDataset, fm)) { - // Yes, save answer and return true - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), true); - return true; - } else { - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - } - - - - - //Check for restrictions - - boolean isRestrictedFile = fm.isRestricted(); - - - // -------------------------------------------------------------------- - // Conditions (2) through (4) are for Restricted files - // -------------------------------------------------------------------- - - - if (isRestrictedFile && session.getUser() instanceof GuestUser){ - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - - - // -------------------------------------------------------------------- - // (3) Does the User have DownloadFile Permission at the **Dataset** level - // -------------------------------------------------------------------- - - - if (isRestrictedFile && !this.doesSessionUserHavePermission(Permission.DownloadFile, fm)){ - // Yes, save answer and return true - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - } - - // (3) Is there tabular data or is the ingest in progress? - // Yes: great - // - if ((fm.getDataFile().isTabularData())||(fm.getDataFile().isIngestInProgress())){ - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), true); - return true; - } - - // Nope - this.fileMetadataTwoRavensExploreMap.put(fm.getId(), false); - return false; - - // (empty fileMetadata.dataFile.id) and (fileMetadata.dataFile.tabularData or fileMetadata.dataFile.ingestInProgress) - // and DatasetPage.canDownloadFile(fileMetadata) - } - - - /** - * Copied over from the dataset page - 9/21/2016 - * - * @return - */ - public String getDataExploreURL() { - String TwoRavensUrl = settingsWrapper.getValueForKey(SettingsServiceBean.Key.TwoRavensUrl); - - if (TwoRavensUrl != null && !TwoRavensUrl.equals("")) { - return TwoRavensUrl; - } - - return ""; - } - - - /** - * Copied over from the dataset page - 9/21/2016 - * - * @param fileid - * @param apiTokenKey - * @return - */ - public String getDataExploreURLComplete(Long fileid) { - if (fileid == null){ - throw new NullPointerException("fileid cannot be null"); - } - - - String TwoRavensUrl = settingsWrapper.getValueForKey(SettingsServiceBean.Key.TwoRavensUrl); - String TwoRavensDefaultLocal = "/dataexplore/gui.html?dfId="; - - if (TwoRavensUrl != null && !TwoRavensUrl.equals("")) { - // If we have TwoRavensUrl set up as, as an optional - // configuration service, it must mean that TwoRavens is sitting - // on some remote server. And that in turn means that we must use - // full URLs to pass data and metadata to it. - // update: actually, no we don't want to use this "dataurl" notation. 
- // switching back to the dfId=: - // -- L.A. 4.1 - /* - String tabularDataURL = getTabularDataFileURL(fileid); - String tabularMetaURL = getVariableMetadataURL(fileid); - return TwoRavensUrl + "?ddiurl=" + tabularMetaURL + "&dataurl=" + tabularDataURL + "&" + getApiTokenKey(); - */ - System.out.print("TwoRavensUrl Set up " + TwoRavensUrl + "?dfId=" + fileid + "&" + getApiTokenKey()); - - return TwoRavensUrl + "?dfId=" + fileid + "&" + getApiTokenKey(); - } - - // For a local TwoRavens setup it's enough to call it with just - // the file id: - return TwoRavensDefaultLocal + fileid + "&" + getApiTokenKey(); - } - - private String getApiTokenKey() { - ApiToken apiToken; - if (session.getUser() == null) { - return null; - } - if (isSessionUserAuthenticated()) { - AuthenticatedUser au = (AuthenticatedUser) session.getUser(); - apiToken = authService.findApiTokenByUser(au); - if (apiToken != null) { - return "key=" + apiToken.getTokenString(); - } - // Generate if not available? - // Or should it just be generated inside the authService - // automatically? - apiToken = authService.generateApiTokenForUser(au); - if (apiToken != null) { - return "key=" + apiToken.getTokenString(); - } - } - return ""; - - } - - public boolean isSessionUserAuthenticated() { - - if (session == null) { - return false; - } - - if (session.getUser() == null) { - return false; - } - - return session.getUser().isAuthenticated(); - - } - - public boolean doesSessionUserHavePermission(Permission permissionToCheck, FileMetadata fileMetadata){ - if (permissionToCheck == null){ - return false; - } - - DvObject objectToCheck = null; - - if (permissionToCheck.equals(Permission.EditDataset)){ - objectToCheck = fileMetadata.getDatasetVersion().getDataset(); - } else if (permissionToCheck.equals(Permission.DownloadFile)){ - objectToCheck = fileMetadata.getDataFile(); - } - - if (objectToCheck == null){ - return false; - } - - - // Check the permission - // - boolean hasPermission = this.permissionService.userOn(this.session.getUser(), objectToCheck).has(permissionToCheck); - - - // return true/false - return hasPermission; - } -} diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/AbstractCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/AbstractCommand.java index e4d0593835b..1876d47fc07 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/AbstractCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/AbstractCommand.java @@ -16,7 +16,7 @@ */ public abstract class AbstractCommand implements Command { - private final Map affectedDataverses; + private final Map affectedDvObjects; private final DataverseRequest request; static protected class DvNamePair { @@ -47,21 +47,21 @@ public AbstractCommand(DataverseRequest aRequest, DvObject anAffectedDvObject) { public AbstractCommand(DataverseRequest aRequest, DvNamePair dvp, DvNamePair... 
more) { request = aRequest; - affectedDataverses = new HashMap<>(); - affectedDataverses.put(dvp.name, dvp.dvObject); + affectedDvObjects = new HashMap<>(); + affectedDvObjects.put(dvp.name, dvp.dvObject); for (DvNamePair p : more) { - affectedDataverses.put(p.name, p.dvObject); + affectedDvObjects.put(p.name, p.dvObject); } } public AbstractCommand(DataverseRequest aRequest, Map someAffectedDvObjects) { request = aRequest; - affectedDataverses = someAffectedDvObjects; + affectedDvObjects = someAffectedDvObjects; } @Override public Map getAffectedDvObjects() { - return affectedDataverses; + return affectedDvObjects; } @Override @@ -81,4 +81,17 @@ public Map> getRequiredPermissions() { protected User getUser() { return getRequest().getUser(); } + + @Override + public String describe() { + StringBuilder sb = new StringBuilder(); + for (Map.Entry ent : affectedDvObjects.entrySet()) { + DvObject value = ent.getValue(); + sb.append(ent.getKey()).append(":"); + sb.append((value != null) ? value.accept(DvObject.NameIdPrinter) : ""); + sb.append(" "); + } + return sb.toString(); + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/Command.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/Command.java index 32a8a3cb282..c6093432092 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/Command.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/Command.java @@ -41,5 +41,6 @@ public interface Command { * @return A map of the permissions required for this command */ Map> getRequiredPermissions(); - + + public String describe(); } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AbstractPublishDatasetCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AbstractPublishDatasetCommand.java index 38708a8efac..9f04f64e0b6 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AbstractPublishDatasetCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AbstractPublishDatasetCommand.java @@ -21,11 +21,7 @@ public AbstractPublishDatasetCommand(Dataset datasetIn, DataverseRequest aReques } protected WorkflowContext buildContext( String doiProvider, WorkflowContext.TriggerType triggerType) { - return new WorkflowContext(getRequest(), theDataset, - theDataset.getLatestVersion().getVersionNumber(), - theDataset.getLatestVersion().getMinorVersionNumber(), - triggerType, - doiProvider); + return new WorkflowContext(getRequest(), theDataset, doiProvider, triggerType); } } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AddLockCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AddLockCommand.java index 1f9ee1e96c2..3001d1532e1 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AddLockCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AddLockCommand.java @@ -28,8 +28,9 @@ public AddLockCommand(DataverseRequest aRequest, Dataset aDataset, DatasetLock a @Override public DatasetLock execute(CommandContext ctxt) throws CommandException { - lock.setDataset(dataset); + ctxt.datasets().addDatasetLock(dataset, lock); + return lock; } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AssignRoleCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AssignRoleCommand.java index 767bee92619..34263599ff0 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AssignRoleCommand.java +++ 
b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/AssignRoleCommand.java @@ -62,4 +62,9 @@ public Map> getRequiredPermissions() { : Collections.singleton(Permission.ManageDatasetPermissions)); } + @Override + public String describe() { + return grantee + " has been given " + role + " on " + defPoint.accept(DvObject.NameIdPrinter); + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java index 6e4e91b07bf..4fba6cf65d0 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDatasetCommand.java @@ -6,6 +6,7 @@ import edu.harvard.iq.dataverse.api.imports.ImportUtil.ImportType; import edu.harvard.iq.dataverse.authorization.Permission; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.dataaccess.DataAccess; import edu.harvard.iq.dataverse.datacapturemodule.DataCaptureModuleUtil; import edu.harvard.iq.dataverse.datacapturemodule.ScriptRequestResponse; import edu.harvard.iq.dataverse.engine.command.AbstractCommand; @@ -15,6 +16,7 @@ import edu.harvard.iq.dataverse.engine.command.exception.CommandException; import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import java.io.IOException; import java.sql.Timestamp; import java.text.SimpleDateFormat; import java.util.Date; @@ -135,13 +137,15 @@ public Dataset execute(CommandContext ctxt) throws CommandException { if (theDataset.getProtocol()==null) theDataset.setProtocol(protocol); if (theDataset.getAuthority()==null) theDataset.setAuthority(authority); if (theDataset.getDoiSeparator()==null) theDataset.setDoiSeparator(doiSeparator); - if(theDataset.getStorageIdentifier()==null) { - //FIXME: if the driver identifier is not set in the JVM options, should the storage identifier be set to file b default, or should an exception be thrown? - if(System.getProperty("dataverse.files.storage-driver-id")!=null){ - theDataset.setStorageIdentifier(System.getProperty("dataverse.files.storage-driver-id")+"://"+theDataset.getAuthority()+theDataset.getDoiSeparator()+theDataset.getIdentifier()); - } - else{ - theDataset.setStorageIdentifier("file://"+theDataset.getAuthority()+theDataset.getDoiSeparator()+theDataset.getIdentifier()); + if (theDataset.getStorageIdentifier() == null) { + try { + DataAccess.createNewStorageIO(theDataset, "placeholder"); + } catch (IOException ioex) { + // if setting the storage identifier through createNewStorageIO fails, dataset creation + // does not have to fail. we just set the storage id to a default -SF + String storageDriver = (System.getProperty("dataverse.files.storage-driver-id") != null) ? System.getProperty("dataverse.files.storage-driver-id") : "file"; + theDataset.setStorageIdentifier(storageDriver + "://" + theDataset.getAuthority()+theDataset.getDoiSeparator()+theDataset.getIdentifier()); + logger.info("Failed to create StorageIO. StorageIdentifier set to default. Not fatal." 
+ "(" + ioex.getMessage() + ")"); } } if (theDataset.getIdentifier()==null) { diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java index c64995a6958..b78c2f316d2 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java @@ -78,8 +78,6 @@ public Dataverse execute(CommandContext ctxt) throws CommandException { created.setDefaultContributorRole(ctxt.roles().findBuiltinRoleByAlias(DataverseRole.EDITOR)); } - // By default, themeRoot should be true - created.setThemeRoot(true); // @todo for now we are saying all dataverses are permission root created.setPermissionRoot(true); diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DeleteDatasetVersionCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DeleteDatasetVersionCommand.java index 5ff5b71b836..c4d53466f82 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DeleteDatasetVersionCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DeleteDatasetVersionCommand.java @@ -36,6 +36,7 @@ public DeleteDatasetVersionCommand(DataverseRequest aRequest, Dataset dataset) { @Override protected void executeImpl(CommandContext ctxt) throws CommandException { + ctxt.permissions().checkEditDatasetLock(doomed, getRequest(), this); // if you are deleting a dataset that only has 1 draft, we are actually destroying the dataset if (doomed.getVersions().size() == 1) { diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/FinalizeDatasetPublicationCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/FinalizeDatasetPublicationCommand.java index acc07284404..faa7d3885f9 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/FinalizeDatasetPublicationCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/FinalizeDatasetPublicationCommand.java @@ -23,13 +23,10 @@ import edu.harvard.iq.dataverse.privateurl.PrivateUrl; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.BundleUtil; -import edu.harvard.iq.dataverse.workflow.Workflow; import edu.harvard.iq.dataverse.workflow.WorkflowContext.TriggerType; import java.io.IOException; import java.sql.Timestamp; import java.util.Date; -import java.util.Optional; -import java.util.ResourceBundle; import java.util.logging.Level; import java.util.logging.Logger; @@ -100,16 +97,25 @@ public Dataset execute(CommandContext ctxt) throws CommandException { } theDataset.getEditVersion().setVersionState(DatasetVersion.VersionState.RELEASED); - exportMetadata(ctxt.settings()); boolean doNormalSolrDocCleanUp = true; ctxt.index().indexDataset(theDataset, doNormalSolrDocCleanUp); ctxt.solrIndex().indexPermissionsForOneDvObject(theDataset); - ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset)); + // Remove locks + ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.Workflow)); + if ( theDataset.isLockedFor(DatasetLock.Reason.InReview) ) { + ctxt.engine().submit( + new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.InReview) ); + } - ctxt.workflows().getDefaultWorkflow(TriggerType.PostPublishDataset) - .ifPresent(wf -> ctxt.workflows().start(wf, buildContext(doiProvider, 
TriggerType.PostPublishDataset))); + ctxt.workflows().getDefaultWorkflow(TriggerType.PostPublishDataset).ifPresent(wf -> { + try { + ctxt.workflows().start(wf, buildContext(doiProvider, TriggerType.PostPublishDataset)); + } catch (CommandException ex) { + logger.log(Level.SEVERE, "Error invoking post-publish workflow: " + ex.getMessage(), ex); + } + }); Dataset resultSet = ctxt.em().merge(theDataset); @@ -258,7 +264,7 @@ private void notifyUsersDatasetPublish(CommandContext ctxt, DvObject subject) { } /** - * Whether it's EZID or DataCiteif, if the registration is + * Whether it's EZID or DataCite, if the registration is * refused because the identifier already exists, we'll generate another one * and try to register again... but only up to some * reasonably high number of times - so that we don't diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/MoveDatasetCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/MoveDatasetCommand.java new file mode 100644 index 00000000000..8fe799c7b94 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/MoveDatasetCommand.java @@ -0,0 +1,104 @@ +/* + * To change this license header, choose License Headers in Project Properties. + * To change this template file, choose Tools | Templates + * and open the template in the editor. + */ +package edu.harvard.iq.dataverse.engine.command.impl; + +import edu.harvard.iq.dataverse.Dataset; +import edu.harvard.iq.dataverse.Dataverse; +import edu.harvard.iq.dataverse.Guestbook; +import edu.harvard.iq.dataverse.authorization.Permission; +import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.engine.command.AbstractVoidCommand; +import edu.harvard.iq.dataverse.engine.command.CommandContext; +import edu.harvard.iq.dataverse.engine.command.DataverseRequest; +import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; +import edu.harvard.iq.dataverse.engine.command.exception.CommandException; +import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; +import edu.harvard.iq.dataverse.engine.command.exception.PermissionException; +import java.util.Collections; +import java.util.List; +import java.util.logging.Level; +import java.util.logging.Logger; + +/** + * Moves Dataset from one dataverse to another + * + * @author skraffmi + */ + +// the permission annotation is open, since this is a superuser-only command - +// and that's enforced in the command body: +@RequiredPermissions({}) +public class MoveDatasetCommand extends AbstractVoidCommand { + + private static final Logger logger = Logger.getLogger(MoveDatasetCommand.class.getCanonicalName()); + final Dataset moved; + final Dataverse destination; + final Boolean force; + + public MoveDatasetCommand(DataverseRequest aRequest, Dataset moved, Dataverse destination, Boolean force) { + super(aRequest, moved); + this.moved = moved; + this.destination = destination; + this.force= force; + } + + @Override + public void executeImpl(CommandContext ctxt) throws CommandException { + + // first check if user is a superuser + if ( (!(getUser() instanceof AuthenticatedUser) || !getUser().isSuperuser() ) ) { + throw new PermissionException("Move Dataset can only be called by superusers.", + this, Collections.singleton(Permission.DeleteDatasetDraft), moved); + } + + + // validate the move makes sense + if (moved.getOwner().equals(destination)) { + throw new IllegalCommandException("Dataset already in this Dataverse ", this); + } + + // if 
dataset is published make sure that its target is published + + if (moved.isReleased() && !destination.isReleased()){ + throw new IllegalCommandException("Published Dataset may not be moved to unpublished Dataverse. You may publish " + destination.getDisplayName() + " and re-try the move.", this); + } + + //if the datasets guestbook is not contained in the new dataverse then remove it + if (moved.getGuestbook() != null) { + Guestbook gb = moved.getGuestbook(); + List gbs = destination.getGuestbooks(); + boolean inheritGuestbooksValue = !destination.isGuestbookRoot(); + if (inheritGuestbooksValue && destination.getOwner() != null) { + for (Guestbook pg : destination.getParentGuestbooks()) { + + gbs.add(pg); + } + } + if (gbs == null || !gbs.contains(gb)) { + if (force == null || !force){ + throw new IllegalCommandException("Dataset guestbook is not in target dataverse. Please use the parameter ?forceMove=true to complete the move. This will delete the guestbook from the Dataset", this); + } + moved.setGuestbook(null); + } + } + + // OK, move + moved.setOwner(destination); + ctxt.em().merge(moved); + + try { + boolean doNormalSolrDocCleanUp = true; + ctxt.index().indexDataset(moved, doNormalSolrDocCleanUp); + + } catch (Exception e) { // RuntimeException e ) { + logger.log(Level.WARNING, "Exception while indexing:" + e.getMessage()); //, e); + throw new CommandException("Dataset could not be moved. Indexing failed", this); + + } + + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java index 22732ea34f7..430a2778d80 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/PublishDatasetCommand.java @@ -1,14 +1,8 @@ package edu.harvard.iq.dataverse.engine.command.impl; -import edu.harvard.iq.dataverse.DataFile; import edu.harvard.iq.dataverse.Dataset; import edu.harvard.iq.dataverse.DatasetLock; -import edu.harvard.iq.dataverse.DatasetVersionUser; -import edu.harvard.iq.dataverse.DvObject; -import edu.harvard.iq.dataverse.UserNotification; -import edu.harvard.iq.dataverse.*; import edu.harvard.iq.dataverse.authorization.Permission; -import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import edu.harvard.iq.dataverse.engine.command.CommandContext; import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; @@ -17,14 +11,11 @@ import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.workflow.Workflow; import edu.harvard.iq.dataverse.workflow.WorkflowContext.TriggerType; -import edu.harvard.iq.dataverse.util.BundleUtil; -import java.io.IOException; -import java.sql.Timestamp; -import java.util.Date; import java.util.Optional; +import static java.util.stream.Collectors.joining; /** - * Kick-off a dataset publication process. The process may complete immediatly, + * Kick-off a dataset publication process. The process may complete immediately, * but may also result in a workflow being started and pending on some external * response. Either way, the process will be completed by an instance of * {@link FinalizeDatasetPublicationCommand}. 
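The changes above and below replace the single per-dataset lock with reason-scoped locks: FinalizeDatasetPublicationCommand now removes the Workflow and InReview locks individually, RemoveLockCommand takes a DatasetLock.Reason, and the publish guard in the following hunks only refuses when a Workflow or Ingest lock is held, reporting every active reason. A minimal standalone sketch of that pattern, assuming nothing from the Dataverse codebase (a plain enum and an EnumSet stand in for the real DatasetLock entity and the dataset service):

import java.util.EnumSet;
import java.util.Set;
import static java.util.stream.Collectors.joining;

// Standalone illustration only: in Dataverse, DatasetLock is a JPA entity attached
// to a Dataset, and removal goes through RemoveLockCommand / removeDatasetLocks(id, reason).
public class LockReasonSketch {

    enum Reason { Ingest, Workflow, InReview }

    // A dataset may now hold several locks at once, each tagged with a reason.
    private final Set<Reason> activeLocks = EnumSet.noneOf(Reason.class);

    void lockFor(Reason reason) { activeLocks.add(reason); }

    // Reason-scoped removal: only the named lock is cleared, others survive.
    void removeLocksFor(Reason reason) { activeLocks.remove(reason); }

    boolean isLockedFor(Reason reason) { return activeLocks.contains(reason); }

    // Publish-style guard: only Workflow and Ingest block publication, and the
    // error lists every active reason, comma separated.
    void checkPublishAllowed() {
        if (isLockedFor(Reason.Workflow) || isLockedFor(Reason.Ingest)) {
            throw new IllegalStateException("This dataset is locked. Reason: "
                    + activeLocks.stream().map(Enum::name).collect(joining(",")));
        }
    }

    public static void main(String[] args) {
        LockReasonSketch dataset = new LockReasonSketch();
        dataset.lockFor(Reason.InReview);
        dataset.checkPublishAllowed();          // passes: InReview alone no longer blocks publishing
        dataset.lockFor(Reason.Ingest);
        try {
            dataset.checkPublishAllowed();      // refused: an Ingest lock is present
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
        dataset.removeLocksFor(Reason.Ingest);  // the InReview lock is untouched
    }
}

The point of keying removal by reason is that an InReview lock can outlive an ingest finishing, and vice versa, which the old single-lock model could not express.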
@@ -64,20 +55,17 @@ public PublishDatasetResult execute(CommandContext ctxt) throws CommandException theDataset.getEditVersion().setVersionNumber(new Long(theDataset.getVersionNumber())); theDataset.getEditVersion().setMinorVersionNumber(new Long(theDataset.getMinorVersionNumber() + 1)); - } else /* major, non-first release */ { + } else { + // major, non-first release theDataset.getEditVersion().setVersionNumber(new Long(theDataset.getVersionNumber() + 1)); theDataset.getEditVersion().setMinorVersionNumber(new Long(0)); } theDataset = ctxt.em().merge(theDataset); - //Move remove lock to after merge... SEK 9/1/17 (why? -- L.A.) - ctxt.engine().submit( new RemoveLockCommand(getRequest(), theDataset)); - Optional prePubWf = ctxt.workflows().getDefaultWorkflow(TriggerType.PrePublishDataset); if ( prePubWf.isPresent() ) { // We start a workflow - ctxt.engine().submit( new AddLockCommand(getRequest(), theDataset, new DatasetLock(DatasetLock.Reason.Workflow, getRequest().getAuthenticatedUser()))); ctxt.workflows().start(prePubWf.get(), buildContext(doiProvider, TriggerType.PrePublishDataset) ); return new PublishDatasetResult(theDataset, false); @@ -100,9 +88,11 @@ private void verifyCommandArguments() throws IllegalCommandException { throw new IllegalCommandException("This dataset may not be published because its host dataverse (" + theDataset.getOwner().getAlias() + ") has not been published.", this); } - if (theDataset.isLocked() && !theDataset.getDatasetLock().getReason().equals(DatasetLock.Reason.InReview)) { - - throw new IllegalCommandException("This dataset is locked. Reason: " + theDataset.getDatasetLock().getReason().toString() + ". Please try publishing later.", this); + if ( theDataset.isLockedFor(DatasetLock.Reason.Workflow) + || theDataset.isLockedFor(DatasetLock.Reason.Ingest) ) { + throw new IllegalCommandException("This dataset is locked. Reason: " + + theDataset.getLocks().stream().map(l -> l.getReason().name()).collect( joining(",") ) + + ". 
Please try publishing later.", this); } if (theDataset.getLatestVersion().isReleased()) { diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/RemoveLockCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/RemoveLockCommand.java index 669e00ea9ba..b9c2f20f37c 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/RemoveLockCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/RemoveLockCommand.java @@ -1,6 +1,7 @@ package edu.harvard.iq.dataverse.engine.command.impl; import edu.harvard.iq.dataverse.Dataset; +import edu.harvard.iq.dataverse.DatasetLock; import edu.harvard.iq.dataverse.authorization.Permission; import edu.harvard.iq.dataverse.engine.command.AbstractVoidCommand; import edu.harvard.iq.dataverse.engine.command.CommandContext; @@ -16,15 +17,17 @@ public class RemoveLockCommand extends AbstractVoidCommand { private final Dataset dataset; + private final DatasetLock.Reason reason; - public RemoveLockCommand(DataverseRequest aRequest, Dataset aDataset) { + public RemoveLockCommand(DataverseRequest aRequest, Dataset aDataset, DatasetLock.Reason aReason) { super(aRequest, aDataset); dataset = aDataset; + reason = aReason; } @Override protected void executeImpl(CommandContext ctxt) throws CommandException { - ctxt.datasets().removeDatasetLock(dataset.getId()); + ctxt.datasets().removeDatasetLocks(dataset.getId(), reason); } } diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/ReturnDatasetToAuthorCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/ReturnDatasetToAuthorCommand.java index 3ee601bde30..fc5272dc406 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/ReturnDatasetToAuthorCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/ReturnDatasetToAuthorCommand.java @@ -1,6 +1,7 @@ package edu.harvard.iq.dataverse.engine.command.impl; import edu.harvard.iq.dataverse.Dataset; +import edu.harvard.iq.dataverse.DatasetLock; import edu.harvard.iq.dataverse.DatasetVersionUser; import edu.harvard.iq.dataverse.UserNotification; import edu.harvard.iq.dataverse.authorization.Permission; @@ -44,7 +45,7 @@ public Dataset execute(CommandContext ctxt) throws CommandException { throw new IllegalCommandException("You must enter a reason for returning a dataset to the author(s).", this); } */ - ctxt.engine().submit( new RemoveLockCommand(getRequest(), theDataset)); + ctxt.engine().submit( new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.InReview)); Dataset updatedDataset = save(ctxt); return updatedDataset; @@ -56,7 +57,15 @@ public Dataset save(CommandContext ctxt) throws CommandException { theDataset.getEditVersion().setLastUpdateTime(updateTime); // We set "in review" to false because now the ball is back in the author's court. theDataset.setModificationTime(updateTime); - theDataset.setDatasetLock(null); + // TODO: ctxt.datasets().removeDatasetLocks() doesn't work. Try RemoveLockCommand? + AuthenticatedUser authenticatedUser = null; + for (DatasetLock lock : theDataset.getLocks()) { + if (DatasetLock.Reason.InReview.equals(lock.getReason())) { + theDataset.removeLock(lock); + // TODO: Are we supposed to remove the dataset lock from the user? What's going on here? 
+ authenticatedUser = lock.getUser(); + } + } Dataset savedDataset = ctxt.em().merge(theDataset); ctxt.em().flush(); diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java index e5a7fde5cdb..fb3824f541c 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetCommand.java @@ -13,7 +13,6 @@ import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; import edu.harvard.iq.dataverse.engine.command.exception.CommandException; -import edu.harvard.iq.dataverse.engine.command.exception.CommandExecutionException; import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import java.sql.Timestamp; @@ -78,6 +77,7 @@ public void setValidateLenient(boolean validateLenient) { @Override public Dataset execute(CommandContext ctxt) throws CommandException { + ctxt.permissions().checkEditDatasetLock(theDataset, getRequest(), this); // first validate // @todo for now we run through an initFields method that creates empty fields for anything without a value // that way they can be checked for required diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetVersionCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetVersionCommand.java index 05bd79c275d..1d1c31315c0 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetVersionCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetVersionCommand.java @@ -36,6 +36,7 @@ public UpdateDatasetVersionCommand(DataverseRequest aRequest, DatasetVersion the public DatasetVersion execute(CommandContext ctxt) throws CommandException { Dataset ds = newVersion.getDataset(); + ctxt.permissions().checkEditDatasetLock(ds, getRequest(), this); DatasetVersion latest = ds.getLatestVersion(); if ( latest == null ) { diff --git a/src/main/java/edu/harvard/iq/dataverse/export/ExportService.java b/src/main/java/edu/harvard/iq/dataverse/export/ExportService.java index 7c8ade78a20..f62013254e0 100644 --- a/src/main/java/edu/harvard/iq/dataverse/export/ExportService.java +++ b/src/main/java/edu/harvard/iq/dataverse/export/ExportService.java @@ -245,7 +245,7 @@ private void cacheExport(DatasetVersion version, String format, JsonObject datas Dataset dataset = version.getDataset(); StorageIO storageIO = null; try { - storageIO = DataAccess.createNewStorageIO(dataset, "file"); + storageIO = DataAccess.createNewStorageIO(dataset, "placeholder"); Channel outputChannel = storageIO.openAuxChannel("export_" + format + ".cached", DataAccessOption.WRITE_ACCESS); outputStream = Channels.newOutputStream((WritableByteChannel) outputChannel); } catch (IOException ioex) { diff --git a/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java b/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java new file mode 100644 index 00000000000..e039407fcf2 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/export/SchemaDotOrgExporter.java @@ -0,0 +1,86 @@ +package edu.harvard.iq.dataverse.export; + +import com.google.auto.service.AutoService; +import edu.harvard.iq.dataverse.DatasetVersion; +import edu.harvard.iq.dataverse.export.spi.Exporter; 
+import edu.harvard.iq.dataverse.util.BundleUtil; +import java.io.IOException; +import java.io.OutputStream; +import java.io.StringReader; +import java.util.logging.Logger; +import javax.json.Json; +import javax.json.JsonObject; +import javax.json.JsonReader; + +@AutoService(Exporter.class) +public class SchemaDotOrgExporter implements Exporter { + + private static final Logger logger = Logger.getLogger(SchemaDotOrgExporter.class.getCanonicalName()); + + public static final String NAME = "schema.org"; + + @Override + public void exportDataset(DatasetVersion version, JsonObject json, OutputStream outputStream) throws ExportException { + String jsonLdAsString = version.getJsonLd(); + StringReader stringReader = new StringReader(jsonLdAsString); + JsonReader jsonReader = Json.createReader(stringReader); + JsonObject jsonLdJsonObject = jsonReader.readObject(); + try { + outputStream.write(jsonLdJsonObject.toString().getBytes("UTF8")); + } catch (IOException ex) { + logger.info("IOException calling outputStream.write: " + ex); + } + try { + outputStream.flush(); + } catch (IOException ex) { + logger.info("IOException calling outputStream.flush: " + ex); + } + } + + @Override + public String getProviderName() { + return NAME; + } + + @Override + public String getDisplayName() { + return BundleUtil.getStringFromBundle("dataset.exportBtn.itemLabel.schemaDotOrg"); + } + + @Override + public Boolean isXMLFormat() { + return false; + } + + @Override + public Boolean isHarvestable() { + // Defer harvesting because the current effort was estimated as a "2": https://github.com/IQSS/dataverse/issues/3700 + return false; + } + + @Override + public Boolean isAvailableToUsers() { + return true; + } + + @Override + public String getXMLNameSpace() throws ExportException { + throw new ExportException(SchemaDotOrgExporter.class.getSimpleName() + ": not an XML format."); + } + + @Override + public String getXMLSchemaLocation() throws ExportException { + throw new ExportException(SchemaDotOrgExporter.class.getSimpleName() + ": not an XML format."); + } + + @Override + public String getXMLSchemaVersion() throws ExportException { + throw new ExportException(SchemaDotOrgExporter.class.getSimpleName() + ": not an XML format."); + } + + @Override + public void setParam(String name, Object value) { + // this exporter doesn't need/doesn't currently take any parameters + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalTool.java b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalTool.java new file mode 100644 index 00000000000..fee92b6c0b9 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalTool.java @@ -0,0 +1,218 @@ +package edu.harvard.iq.dataverse.externaltools; + +import java.io.Serializable; +import java.util.Arrays; +import javax.json.Json; +import javax.json.JsonObjectBuilder; +import javax.persistence.Column; +import javax.persistence.Entity; +import javax.persistence.EnumType; +import javax.persistence.Enumerated; +import javax.persistence.GeneratedValue; +import javax.persistence.GenerationType; +import javax.persistence.Id; + +/** + * A specification or definition for how an external tool is intended to + * operate. The specification is applied dynamically on a per-file basis through + * an {@link ExternalToolHandler}. 
+ */ +@Entity +public class ExternalTool implements Serializable { + + public static final String DISPLAY_NAME = "displayName"; + public static final String DESCRIPTION = "description"; + public static final String TYPE = "type"; + public static final String TOOL_URL = "toolUrl"; + public static final String TOOL_PARAMETERS = "toolParameters"; + + @Id + @GeneratedValue(strategy = GenerationType.IDENTITY) + private Long id; + + /** + * The display name (on the button, for example) of the tool in English. + */ + // TODO: How are we going to internationalize the display name? + @Column(nullable = false) + private String displayName; + + /** + * The description of the tool in English. + */ + // TODO: How are we going to internationalize the description? + @Column(nullable = false, columnDefinition = "TEXT") + private String description; + + /** + * Whether the tool is an "explore" tool or a "configure" tool, for example. + */ + @Column(nullable = false) + @Enumerated(EnumType.STRING) + private Type type; + + @Column(nullable = false) + private String toolUrl; + + /** + * Parameters the tool requires such as DataFile id and API Token as a JSON + * object, persisted as a String. + */ + @Column(nullable = false) + private String toolParameters; + + /** + * This default constructor is only here to prevent this error at + * deployment: + * + * Exception Description: The instance creation method + * [...ExternalTool.], with no parameters, does not + * exist, or is not accessible + * + * Don't use it. + */ + @Deprecated + public ExternalTool() { + } + + public ExternalTool(String displayName, String description, Type type, String toolUrl, String toolParameters) { + this.displayName = displayName; + this.description = description; + this.type = type; + this.toolUrl = toolUrl; + this.toolParameters = toolParameters; + } + + public enum Type { + + EXPLORE("explore"), + CONFIGURE("configure"); + + private final String text; + + private Type(final String text) { + this.text = text; + } + + public static Type fromString(String text) { + if (text != null) { + for (Type type : Type.values()) { + if (text.equals(type.text)) { + return type; + } + } + } + throw new IllegalArgumentException("Type must be one of these values: " + Arrays.asList(Type.values()) + "."); + } + + @Override + public String toString() { + return text; + } + } + + public Long getId() { + return id; + } + + public void setId(Long id) { + this.id = id; + } + + public String getDisplayName() { + return displayName; + } + + public void setDisplayName(String displayName) { + this.displayName = displayName; + } + + public String getDescription() { + return description; + } + + public void setDescription(String description) { + this.description = description; + } + + public Type getType() { + return type; + } + + public String getToolUrl() { + return toolUrl; + } + + public void setToolUrl(String toolUrl) { + this.toolUrl = toolUrl; + } + + public String getToolParameters() { + return toolParameters; + } + + public void setToolParameters(String toolParameters) { + this.toolParameters = toolParameters; + } + + public JsonObjectBuilder toJson() { + JsonObjectBuilder jab = Json.createObjectBuilder(); + jab.add("id", getId()); + jab.add(DISPLAY_NAME, getDisplayName()); + jab.add(DESCRIPTION, getDescription()); + jab.add(TYPE, getType().text); + jab.add(TOOL_URL, getToolUrl()); + jab.add(TOOL_PARAMETERS, getToolParameters()); + return jab; + } + + public enum ReservedWord { + + // TODO: Research if a format like "{reservedWord}" is easily parse-able 
or if another format would be + // better. The choice of curly braces is somewhat arbitrary, but has been observed in documenation for + // various REST APIs. For example, "Variable substitutions will be made when a variable is named in {brackets}." + // from https://swagger.io/specification/#fixed-fields-29 but that's for URLs. + FILE_ID("fileId"), + SITE_URL("siteUrl"), + API_TOKEN("apiToken"); + + private final String text; + private final String START = "{"; + private final String END = "}"; + + private ReservedWord(final String text) { + this.text = START + text + END; + } + + /** + * This is a centralized method that enforces that only reserved words + * are allowed to be used by external tools. External tool authors + * cannot pass their own query parameters through Dataverse such as + * "mode=mode1". + * + * @throws IllegalArgumentException + */ + public static ReservedWord fromString(String text) throws IllegalArgumentException { + if (text != null) { + for (ReservedWord reservedWord : ReservedWord.values()) { + if (text.equals(reservedWord.text)) { + return reservedWord; + } + } + } + // TODO: Consider switching to a more informative message that enumerates the valid reserved words. + boolean moreInformativeMessage = false; + if (moreInformativeMessage) { + throw new IllegalArgumentException("Unknown reserved word: " + text + ". A reserved word must be one of these values: " + Arrays.asList(ReservedWord.values()) + "."); + } else { + throw new IllegalArgumentException("Unknown reserved word: " + text); + } + } + + @Override + public String toString() { + return text; + } + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolHandler.java b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolHandler.java new file mode 100644 index 00000000000..9fb5a72d9ba --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolHandler.java @@ -0,0 +1,107 @@ +package edu.harvard.iq.dataverse.externaltools; + +import edu.harvard.iq.dataverse.DataFile; +import edu.harvard.iq.dataverse.authorization.users.ApiToken; +import edu.harvard.iq.dataverse.externaltools.ExternalTool.ReservedWord; +import edu.harvard.iq.dataverse.util.SystemConfig; +import java.io.StringReader; +import java.util.ArrayList; +import java.util.List; +import java.util.logging.Logger; +import javax.json.Json; +import javax.json.JsonArray; +import javax.json.JsonObject; +import javax.json.JsonReader; + +/** + * Handles an operation on a specific file. Requires a file id in order to be + * instantiated. Applies logic based on an {@link ExternalTool} specification, + * such as constructing a URL to access that file. + */ +public class ExternalToolHandler { + + private static final Logger logger = Logger.getLogger(ExternalToolHandler.class.getCanonicalName()); + + private final ExternalTool externalTool; + private final DataFile dataFile; + + private final ApiToken apiToken; + + /** + * @param externalTool The database entity. + * @param dataFile Required. + * @param apiToken The apiToken can be null because "explore" tools can be + * used anonymously. 
+ */ + public ExternalToolHandler(ExternalTool externalTool, DataFile dataFile, ApiToken apiToken) { + this.externalTool = externalTool; + if (dataFile == null) { + String error = "A DataFile is required."; + logger.warning("Error in ExternalToolHandler constructor: " + error); + throw new IllegalArgumentException(error); + } + this.dataFile = dataFile; + this.apiToken = apiToken; + } + + public DataFile getDataFile() { + return dataFile; + } + + public ApiToken getApiToken() { + return apiToken; + } + + // TODO: rename to handleRequest() to someday handle sending headers as well as query parameters. + public String getQueryParametersForUrl() { + String toolParameters = externalTool.getToolParameters(); + JsonReader jsonReader = Json.createReader(new StringReader(toolParameters)); + JsonObject obj = jsonReader.readObject(); + JsonArray queryParams = obj.getJsonArray("queryParameters"); + if (queryParams == null || queryParams.isEmpty()) { + return ""; + } + List params = new ArrayList<>(); + queryParams.getValuesAs(JsonObject.class).forEach((queryParam) -> { + queryParam.keySet().forEach((key) -> { + String value = queryParam.getString(key); + String param = getQueryParam(key, value); + if (param != null && !param.isEmpty()) { + params.add(getQueryParam(key, value)); + } + }); + }); + return "?" + String.join("&", params); + } + + private String getQueryParam(String key, String value) { + ReservedWord reservedWord = ReservedWord.fromString(value); + switch (reservedWord) { + case FILE_ID: + // getDataFile is never null because of the constructor + return key + "=" + getDataFile().getId(); + case SITE_URL: + return key + "=" + SystemConfig.getDataverseSiteUrlStatic(); + case API_TOKEN: + String apiTokenString = null; + ApiToken theApiToken = getApiToken(); + if (theApiToken != null) { + apiTokenString = theApiToken.getTokenString(); + return key + "=" + apiTokenString; + } + break; + default: + break; + } + return null; + } + + public String getToolUrlWithQueryParams() { + return externalTool.getToolUrl() + getQueryParametersForUrl(); + } + + public ExternalTool getExternalTool() { + return externalTool; + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolServiceBean.java new file mode 100644 index 00000000000..35406a7f22b --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/externaltools/ExternalToolServiceBean.java @@ -0,0 +1,139 @@ +package edu.harvard.iq.dataverse.externaltools; + +import edu.harvard.iq.dataverse.DataFile; +import edu.harvard.iq.dataverse.externaltools.ExternalTool.ReservedWord; +import static edu.harvard.iq.dataverse.externaltools.ExternalTool.DESCRIPTION; +import static edu.harvard.iq.dataverse.externaltools.ExternalTool.DISPLAY_NAME; +import static edu.harvard.iq.dataverse.externaltools.ExternalTool.TOOL_PARAMETERS; +import static edu.harvard.iq.dataverse.externaltools.ExternalTool.TOOL_URL; +import static edu.harvard.iq.dataverse.externaltools.ExternalTool.TYPE; +import java.io.StringReader; +import java.util.ArrayList; +import java.util.List; +import java.util.Set; +import java.util.logging.Logger; +import javax.ejb.Stateless; +import javax.inject.Named; +import javax.json.Json; +import javax.json.JsonArray; +import javax.json.JsonObject; +import javax.json.JsonReader; +import javax.persistence.EntityManager; +import javax.persistence.NoResultException; +import javax.persistence.NonUniqueResultException; +import 
javax.persistence.PersistenceContext; +import javax.persistence.TypedQuery; + +@Stateless +@Named +public class ExternalToolServiceBean { + + private static final Logger logger = Logger.getLogger(ExternalToolServiceBean.class.getCanonicalName()); + + @PersistenceContext(unitName = "VDCNet-ejbPU") + private EntityManager em; + + public List findAll() { + TypedQuery typedQuery = em.createQuery("SELECT OBJECT(o) FROM ExternalTool AS o ORDER BY o.id", ExternalTool.class); + return typedQuery.getResultList(); + } + + + /** + * @return A list of tools or an empty list. + */ + public List findByType(ExternalTool.Type type) { + List externalTools = new ArrayList<>(); + TypedQuery typedQuery = em.createQuery("SELECT OBJECT(o) FROM ExternalTool AS o WHERE o.type = :type", ExternalTool.class); + typedQuery.setParameter("type", type); + List toolsFromQuery = typedQuery.getResultList(); + if (toolsFromQuery != null) { + externalTools = toolsFromQuery; + } + return externalTools; + } + + public ExternalTool findById(long id) { + TypedQuery typedQuery = em.createQuery("SELECT OBJECT(o) FROM ExternalTool AS o WHERE o.id = :id", ExternalTool.class); + typedQuery.setParameter("id", id); + try { + ExternalTool externalTool = typedQuery.getSingleResult(); + return externalTool; + } catch (NoResultException | NonUniqueResultException ex) { + return null; + } + } + + public boolean delete(long doomedId) { + ExternalTool doomed = findById(doomedId); + try { + em.remove(doomed); + return true; + } catch (Exception ex) { + logger.info("Could not delete external tool with id of " + doomedId); + return false; + } + } + + public ExternalTool save(ExternalTool externalTool) { + em.persist(externalTool); + return em.merge(externalTool); + } + + /** + * This method takes a list of tools and a file and returns which tools that file supports + * The list of tools is passed in so it doesn't hit the database each time + */ + public static List findExternalToolsByFile(List allExternalTools, DataFile file) { + List externalTools = new ArrayList<>(); + allExternalTools.forEach((externalTool) -> { + if (file.isTabularData()) { + externalTools.add(externalTool); + } + }); + + return externalTools; + } + + public static ExternalTool parseAddExternalToolManifest(String manifest) { + if (manifest == null || manifest.isEmpty()) { + throw new IllegalArgumentException("External tool manifest was null or empty!"); + } + JsonReader jsonReader = Json.createReader(new StringReader(manifest)); + JsonObject jsonObject = jsonReader.readObject(); + String displayName = getRequiredTopLevelField(jsonObject, DISPLAY_NAME); + String description = getRequiredTopLevelField(jsonObject, DESCRIPTION); + String typeUserInput = getRequiredTopLevelField(jsonObject, TYPE); + // Allow IllegalArgumentException to bubble up from ExternalTool.Type.fromString + ExternalTool.Type type = ExternalTool.Type.fromString(typeUserInput); + String toolUrl = getRequiredTopLevelField(jsonObject, TOOL_URL); + JsonObject toolParametersObj = jsonObject.getJsonObject(TOOL_PARAMETERS); + JsonArray queryParams = toolParametersObj.getJsonArray("queryParameters"); + boolean allRequiredReservedWordsFound = false; + for (JsonObject queryParam : queryParams.getValuesAs(JsonObject.class)) { + Set keyValuePair = queryParam.keySet(); + for (String key : keyValuePair) { + String value = queryParam.getString(key); + ReservedWord reservedWord = ReservedWord.fromString(value); + if (reservedWord.equals(ReservedWord.FILE_ID)) { + allRequiredReservedWordsFound = true; + } + } + } + if 
(!allRequiredReservedWordsFound) { + // Some day there might be more reserved words than just {fileId}. + throw new IllegalArgumentException("Required reserved word not found: " + ReservedWord.FILE_ID.toString()); + } + String toolParameters = toolParametersObj.toString(); + return new ExternalTool(displayName, description, type, toolUrl, toolParameters); + } + + private static String getRequiredTopLevelField(JsonObject jsonObject, String key) { + try { + return jsonObject.getString(key); + } catch (NullPointerException ex) { + throw new IllegalArgumentException(key + " is required."); + } + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/ingest/IngestMessageBean.java b/src/main/java/edu/harvard/iq/dataverse/ingest/IngestMessageBean.java index 46fa7370d3f..c9886dcab13 100644 --- a/src/main/java/edu/harvard/iq/dataverse/ingest/IngestMessageBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/ingest/IngestMessageBean.java @@ -20,30 +20,23 @@ package edu.harvard.iq.dataverse.ingest; -import edu.harvard.iq.dataverse.DatasetVersion; import edu.harvard.iq.dataverse.DatasetServiceBean; import edu.harvard.iq.dataverse.DataFileServiceBean; import edu.harvard.iq.dataverse.DataFile; import edu.harvard.iq.dataverse.Dataset; -import edu.harvard.iq.dataverse.ingest.IngestServiceBean; +import edu.harvard.iq.dataverse.DatasetLock; -import java.io.File; -import java.util.ArrayList; import java.util.Iterator; -import java.util.List; import java.util.logging.Logger; import javax.ejb.ActivationConfigProperty; import javax.ejb.EJB; import javax.ejb.MessageDriven; import javax.ejb.TransactionAttribute; import javax.ejb.TransactionAttributeType; -import javax.faces.application.FacesMessage; import javax.jms.JMSException; import javax.jms.Message; import javax.jms.MessageListener; import javax.jms.ObjectMessage; -import javax.naming.Context; -import javax.naming.InitialContext; /** * @@ -135,7 +128,7 @@ public void onMessage(Message message) { if (datafile != null) { Dataset dataset = datafile.getOwner(); if (dataset != null && dataset.getId() != null) { - datasetService.removeDatasetLock(dataset.getId()); + datasetService.removeDatasetLocks(dataset.getId(), DatasetLock.Reason.Ingest); } } } diff --git a/src/main/java/edu/harvard/iq/dataverse/mydata/DataRetrieverAPI.java b/src/main/java/edu/harvard/iq/dataverse/mydata/DataRetrieverAPI.java index c369c1f52e0..efcfdbdae96 100644 --- a/src/main/java/edu/harvard/iq/dataverse/mydata/DataRetrieverAPI.java +++ b/src/main/java/edu/harvard/iq/dataverse/mydata/DataRetrieverAPI.java @@ -4,6 +4,7 @@ package edu.harvard.iq.dataverse.mydata; import edu.harvard.iq.dataverse.DataverseRoleServiceBean; +import edu.harvard.iq.dataverse.DataverseServiceBean; import edu.harvard.iq.dataverse.DataverseSession; import edu.harvard.iq.dataverse.DvObjectServiceBean; import edu.harvard.iq.dataverse.RoleAssigneeServiceBean; @@ -22,6 +23,7 @@ import edu.harvard.iq.dataverse.search.SearchException; import edu.harvard.iq.dataverse.search.SearchFields; import edu.harvard.iq.dataverse.search.SortBy; +import java.math.BigDecimal; import java.util.List; import java.util.Map; import java.util.Random; @@ -67,6 +69,8 @@ public class DataRetrieverAPI extends AbstractApiBean { SearchServiceBean searchService; @EJB AuthenticationServiceBean authenticationService; + @EJB + DataverseServiceBean dataverseService; //@EJB //MyDataQueryHelperServiceBean myDataQueryHelperServiceBean; @EJB @@ -522,6 +526,12 @@ private JsonArrayBuilder formatSolrDocs(SolrQueryResponse solrResponse, RoleTagR // 
------------------------------------------- myDataCardInfo = doc.getJsonForMyData(); + if (!doc.getEntity().isInstanceofDataFile()){ + String parentAlias = dataverseService.getParentAliasString(doc); + System.out.print("parentAlias: " + parentAlias); + myDataCardInfo.add("parent_alias",parentAlias); + } + // ------------------------------------------- // (b) Add role info // ------------------------------------------- diff --git a/src/main/java/edu/harvard/iq/dataverse/passwordreset/PasswordResetPage.java b/src/main/java/edu/harvard/iq/dataverse/passwordreset/PasswordResetPage.java index 228c6741a80..a0b3ec437c2 100644 --- a/src/main/java/edu/harvard/iq/dataverse/passwordreset/PasswordResetPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/passwordreset/PasswordResetPage.java @@ -13,6 +13,7 @@ import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.BundleUtil; +import edu.harvard.iq.dataverse.util.SystemConfig; import java.util.logging.Level; import java.util.logging.Logger; import javax.ejb.EJB; diff --git a/src/main/java/edu/harvard/iq/dataverse/search/IndexAllServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/search/IndexAllServiceBean.java index 8c72e5296f5..6e5135e131a 100644 --- a/src/main/java/edu/harvard/iq/dataverse/search/IndexAllServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/search/IndexAllServiceBean.java @@ -60,7 +60,7 @@ public JsonObjectBuilder indexAllOrSubsetPreview(long numPartitions, long partit } JsonArrayBuilder datasetIds = Json.createArrayBuilder(); List datasets = datasetService.findAllOrSubset(numPartitions, partitionId, skipIndexed); - for (Dataset dataset : datasets) { + for (Dataset dataset : datasets) { datasetIds.add(dataset.getId()); } dvContainerIds.add("dataverses", dataverseIds); @@ -107,18 +107,33 @@ public Future indexAllOrSubset(long numPartitions, long partitionId, boo List dataverses = dataverseService.findAllOrSubset(numPartitions, partitionId, skipIndexed); int dataverseIndexCount = 0; + int dataverseFailureCount = 0; for (Dataverse dataverse : dataverses) { - dataverseIndexCount++; - logger.info("indexing dataverse " + dataverseIndexCount + " of " + dataverses.size() + " (id=" + dataverse.getId() + ", persistentId=" + dataverse.getAlias() + ")"); - Future result = indexService.indexDataverseInNewTransaction(dataverse); + try { + dataverseIndexCount++; + logger.info("indexing dataverse " + dataverseIndexCount + " of " + dataverses.size() + " (id=" + dataverse.getId() + ", persistentId=" + dataverse.getAlias() + ")"); + Future result = indexService.indexDataverseInNewTransaction(dataverse); + } catch (Exception e) { + //We want to keep running even after an exception so throw some more info into the log + dataverseFailureCount++; + logger.info("FAILURE indexing dataverse " + dataverseIndexCount + " of " + dataverses.size() + " (id=" + dataverse.getId() + ", persistentId=" + dataverse.getAlias() + ") Exception info: " + e.getMessage()); + } } int datasetIndexCount = 0; + int datasetFailureCount = 0; List datasets = datasetService.findAllOrSubset(numPartitions, partitionId, skipIndexed); for (Dataset dataset : datasets) { - datasetIndexCount++; - logger.info("indexing dataset " + datasetIndexCount + " of " + datasets.size() + " (id=" + dataset.getId() + ", persistentId=" + dataset.getGlobalId() + ")"); - Future result = indexService.indexDatasetInNewTransaction(dataset); + try { + datasetIndexCount++; + 
logger.info("indexing dataset " + datasetIndexCount + " of " + datasets.size() + " (id=" + dataset.getId() + ", persistentId=" + dataset.getGlobalId() + ")"); + Future result = indexService.indexDatasetInNewTransaction(dataset); + } catch (Exception e) { + //We want to keep running even after an exception so throw some more info into the log + datasetFailureCount++; + logger.info("FAILURE indexing dataset " + datasetIndexCount + " of " + datasets.size() + " (id=" + dataset.getId() + ", identifier = " + dataset.getIdentifier() + ") Exception info: " + e.getMessage()); + } + } // logger.info("advanced search fields: " + advancedSearchFields); // logger.info("not advanced search fields: " + notAdvancedSearchFields); @@ -127,6 +142,10 @@ public Future indexAllOrSubset(long numPartitions, long partitionId, boo long indexAllTimeEnd = System.currentTimeMillis(); String timeElapsed = "index all took " + (indexAllTimeEnd - indexAllTimeBegin) + " milliseconds"; logger.info(timeElapsed); + if (datasetFailureCount + dataverseFailureCount > 0){ + String failureMessage = "There were index failures. " + dataverseFailureCount + " dataverse(s) and " + datasetFailureCount + " dataset(s) failed to index. Please check the log for more information."; + logger.info(failureMessage); + } status = dataverseIndexCount + " dataverses and " + datasetIndexCount + " datasets indexed. " + timeElapsed + ". " + resultOfClearingIndexTimes + "\n"; logger.info(status); return new AsyncResult<>(status); diff --git a/src/main/java/edu/harvard/iq/dataverse/search/SearchFields.java b/src/main/java/edu/harvard/iq/dataverse/search/SearchFields.java index 0dddfd3b22e..5e6b592536d 100644 --- a/src/main/java/edu/harvard/iq/dataverse/search/SearchFields.java +++ b/src/main/java/edu/harvard/iq/dataverse/search/SearchFields.java @@ -83,7 +83,7 @@ public class SearchFields { */ public static final String IS_HARVESTED = "isHarvested"; /** - * Such as http://dx.doi.org/10.5072/FK2/HXI35W + * Such as https://doi.org/10.5072/FK2/HXI35W * * For files, the URL will be the parent dataset. 
*/ diff --git a/src/main/java/edu/harvard/iq/dataverse/search/SearchIncludeFragment.java b/src/main/java/edu/harvard/iq/dataverse/search/SearchIncludeFragment.java index c4f48a8cd63..aae7292b029 100644 --- a/src/main/java/edu/harvard/iq/dataverse/search/SearchIncludeFragment.java +++ b/src/main/java/edu/harvard/iq/dataverse/search/SearchIncludeFragment.java @@ -1150,7 +1150,7 @@ public String dataFileChecksumDisplay(DataFile datafile) { return ""; } - if (datafile.getChecksumValue() != null && datafile.getChecksumValue() != "") { + if (datafile.getChecksumValue() != null && !StringUtils.isEmpty(datafile.getChecksumValue())) { if (datafile.getChecksumType() != null) { return " " + datafile.getChecksumType() + ": " + datafile.getChecksumValue() + " "; } diff --git a/src/main/java/edu/harvard/iq/dataverse/search/SearchServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/search/SearchServiceBean.java index 318d8febaaf..ad568edccd9 100644 --- a/src/main/java/edu/harvard/iq/dataverse/search/SearchServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/search/SearchServiceBean.java @@ -466,7 +466,7 @@ public SolrQueryResponse search(DataverseRequest dataverseRequest, Dataverse dat */ if (type.equals("dataverses")) { solrSearchResult.setName(name); - solrSearchResult.setHtmlUrl(baseUrl + "/dataverse/" + identifier); + solrSearchResult.setHtmlUrl(baseUrl + SystemConfig.DATAVERSE_PATH + identifier); // Do not set the ImageUrl, let the search include fragment fill in // the thumbnail, similarly to how the dataset and datafile cards // are handled. diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java index 29376f04d97..8508d7ea1fb 100644 --- a/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java @@ -32,6 +32,7 @@ public class SettingsServiceBean { * So there. */ public enum Key { + AllowApiTokenLookupViaApi, /** * Ordered, comma-separated list of custom fields to show above the fold * on dataset page such as "data_type,sample,pdb" @@ -187,8 +188,6 @@ public enum Key { DoiPassword, DoiBaseurlstring, */ - /* TwoRavens location */ - TwoRavensUrl, /** Optionally override http://guides.dataverse.org . */ GuidesBaseUrl, @@ -248,17 +247,6 @@ public enum Key { will be available to users. */ GeoconnectDebug, - - /** - Whether to allow a user to view tabular files - using the TwoRavens application - This boolean effects whether a user may see the - Explore Button that links to TwoRavens - Default is false; - */ - TwoRavensTabularView, - - /** The message added to a popup upon dataset publish * @@ -308,6 +296,12 @@ Whether Harvesting (OAI) service is enabled // Option to override multiple guides with a single url NavbarGuidesUrl, + /** + * The theme for the root dataverse can get in the way when you try make + * use of HeaderCustomizationFile and LogoCustomizationFile so this is a + * way to disable it. 
+ */ + DisableRootDataverseTheme, // Limit on how many guestbook entries to display on the guestbook-responses page: GuestbookResponsesPageDisplayLimit, diff --git a/src/main/java/edu/harvard/iq/dataverse/timer/DataverseTimerServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/timer/DataverseTimerServiceBean.java index 85750282f55..f4a30139a97 100644 --- a/src/main/java/edu/harvard/iq/dataverse/timer/DataverseTimerServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/timer/DataverseTimerServiceBean.java @@ -258,8 +258,8 @@ public void createHarvestTimer(HarvestingClient harvestingClient) { } else if (harvestingClient.getSchedulePeriod().equals(harvestingClient.SCHEDULE_PERIOD_WEEKLY)) { intervalDuration = 1000 * 60 * 60 * 24 * 7; initExpiration.set(Calendar.HOUR_OF_DAY, harvestingClient.getScheduleHourOfDay()); - initExpiration.set(Calendar.DAY_OF_WEEK, harvestingClient.getScheduleDayOfWeek()); - + initExpiration.set(Calendar.DAY_OF_WEEK, harvestingClient.getScheduleDayOfWeek() + 1); //(saved as zero-based array but Calendar is one-based.) + } else { logger.log(Level.WARNING, "Could not set timer for harvesting client id=" + harvestingClient.getId() + ", unknown schedule period: " + harvestingClient.getSchedulePeriod()); return; diff --git a/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java b/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java index ee26d1e19c5..b69631eac12 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java @@ -42,6 +42,8 @@ public class SystemConfig { @EJB AuthenticationServiceBean authenticationService; + + public static final String DATAVERSE_PATH = "/dataverse/"; /** * A JVM option for the advertised fully qualified domain name (hostname) of @@ -275,6 +277,10 @@ public static int getMinutesUntilPasswordResetTokenExpires() { * by the Settings Service configuration. */ public String getDataverseSiteUrl() { + return getDataverseSiteUrlStatic(); + } + + public static String getDataverseSiteUrlStatic() { String hostUrl = System.getProperty(SITE_URL); if (hostUrl != null && !"".equals(hostUrl)) { return hostUrl; diff --git a/src/main/java/edu/harvard/iq/dataverse/workflow/PendingWorkflowInvocation.java b/src/main/java/edu/harvard/iq/dataverse/workflow/PendingWorkflowInvocation.java index c335436f5b7..b2f4171a190 100644 --- a/src/main/java/edu/harvard/iq/dataverse/workflow/PendingWorkflowInvocation.java +++ b/src/main/java/edu/harvard/iq/dataverse/workflow/PendingWorkflowInvocation.java @@ -20,7 +20,7 @@ /** * A workflow whose current step waits for an external system to complete a - * (probably lengthy) process. Meanwhile, it sits in the database, pending. + * (probably lengthy) process. Meanwhile, it sits in the database, pending away. 
* * @author michael */ @@ -38,6 +38,7 @@ public class PendingWorkflowInvocation implements Serializable { @OneToOne Dataset dataset; + long nextVersionNumber; long nextMinorVersionNumber; @@ -165,5 +166,4 @@ public int getTypeOrdinal() { public void setTypeOrdinal(int typeOrdinal) { this.typeOrdinal = typeOrdinal; } - } diff --git a/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowContext.java b/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowContext.java index 09129a6d796..0cca2bd64a9 100644 --- a/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowContext.java +++ b/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowContext.java @@ -6,8 +6,8 @@ import java.util.UUID; /** - * The context in which the workflow is performed. Contains information steps might - * need, such as the dataset being worked on an version data. + * The context in which a workflow is performed. Contains information steps might + * need, such as the dataset being worked on and version data. * * Design-wise, this class allows us to add parameters to {@link WorkflowStep} without * changing its method signatures, which would break break client code. @@ -29,7 +29,16 @@ public enum TriggerType { private String invocationId = UUID.randomUUID().toString(); - public WorkflowContext(DataverseRequest request, Dataset dataset, long nextVersionNumber, long nextMinorVersionNumber, TriggerType type, String doiProvider) { + public WorkflowContext( DataverseRequest aRequest, Dataset aDataset, String doiProvider, TriggerType aTriggerType ) { + this( aRequest, aDataset, + aDataset.getLatestVersion().getVersionNumber(), + aDataset.getLatestVersion().getMinorVersionNumber(), + aTriggerType, + doiProvider); + } + + public WorkflowContext(DataverseRequest request, Dataset dataset, long nextVersionNumber, + long nextMinorVersionNumber, TriggerType type, String doiProvider) { this.request = request; this.dataset = dataset; this.nextVersionNumber = nextVersionNumber; @@ -74,5 +83,4 @@ public TriggerType getType() { return type; } - } diff --git a/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowServiceBean.java index 3791e9f3851..4b581883274 100644 --- a/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/workflow/WorkflowServiceBean.java @@ -1,5 +1,7 @@ package edu.harvard.iq.dataverse.workflow; +import edu.harvard.iq.dataverse.DatasetLock; +import edu.harvard.iq.dataverse.DatasetServiceBean; import edu.harvard.iq.dataverse.EjbDataverseEngine; import edu.harvard.iq.dataverse.RoleAssigneeServiceBean; import edu.harvard.iq.dataverse.engine.command.exception.CommandException; @@ -17,15 +19,16 @@ import java.util.List; import java.util.Map; import java.util.Optional; -import java.util.ServiceConfigurationError; -import java.util.ServiceLoader; import java.util.logging.Level; import java.util.logging.Logger; import javax.ejb.Asynchronous; import javax.ejb.EJB; import javax.ejb.Stateless; +import javax.ejb.TransactionAttribute; +import javax.ejb.TransactionAttributeType; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; +import javax.persistence.Query; /** * Service bean for managing and executing {@link Workflow}s @@ -38,8 +41,11 @@ public class WorkflowServiceBean { private static final Logger logger = Logger.getLogger(WorkflowServiceBean.class.getName()); private static final String WORKFLOW_ID_KEY = "WorkflowServiceBean.WorkflowId:"; - 
@PersistenceContext + @PersistenceContext(unitName = "VDCNet-ejbPU") EntityManager em; + + @EJB + DatasetServiceBean datasets; @EJB SettingsServiceBean settings; @@ -76,9 +82,13 @@ public WorkflowServiceBean() { * * @param wf the workflow to execute. * @param ctxt the context in which the workflow is executed. + * @throws CommandException If the dataset could not be locked. */ - public void start(Workflow wf, WorkflowContext ctxt) { - forward(wf, ctxt, 0); + @Asynchronous + public void start(Workflow wf, WorkflowContext ctxt) throws CommandException { + ctxt = refresh(ctxt); + lockDataset(ctxt); + forward(wf, ctxt); } /** @@ -92,37 +102,22 @@ public void start(Workflow wf, WorkflowContext ctxt) { * #doResume(edu.harvard.iq.dataverse.workflow.PendingWorkflowInvocation, * java.lang.String) */ + @Asynchronous public void resume(PendingWorkflowInvocation pending, String body) { em.remove(em.merge(pending)); doResume(pending, body); } + @Asynchronous - private void forward(Workflow wf, WorkflowContext ctxt, int idx) { - WorkflowStepData wsd = wf.getSteps().get(idx); - WorkflowStep step = createStep(wsd); - WorkflowStepResult res = step.run(ctxt); - - if (res == WorkflowStepResult.OK) { - if (idx == wf.getSteps().size() - 1) { - workflowCompleted(wf, ctxt); - } else { - forward(wf, ctxt, ++idx); - } - - } else if (res instanceof Failure) { - logger.log(Level.WARNING, "Workflow {0} failed: {1}", new Object[]{ctxt.getInvocationId(), ((Failure) res).getReason()}); - rollback(wf, ctxt, (Failure) res, idx - 1); - - } else if (res instanceof Pending) { - pauseAndAwait(wf, ctxt, (Pending) res, idx); - } + private void forward(Workflow wf, WorkflowContext ctxt) { + executeSteps(wf, ctxt, 0); } - @Asynchronous private void doResume(PendingWorkflowInvocation pending, String body) { Workflow wf = pending.getWorkflow(); List stepsLeft = wf.getSteps().subList(pending.getPendingStepIdx(), wf.getSteps().size()); + WorkflowStep pendingStep = createStep(stepsLeft.get(0)); final WorkflowContext ctxt = pending.reCreateContext(roleAssignees); @@ -132,52 +127,129 @@ private void doResume(PendingWorkflowInvocation pending, String body) { } else if (res instanceof Pending) { pauseAndAwait(wf, ctxt, (Pending) res, pending.getPendingStepIdx()); } else { - forward(wf, ctxt, pending.getPendingStepIdx() + 1); + executeSteps(wf, ctxt, pending.getPendingStepIdx() + 1); } } @Asynchronous - private void rollback(Workflow wf, WorkflowContext ctxt, Failure failure, int idx) { - WorkflowStepData wsd = wf.getSteps().get(idx); - logger.log(Level.INFO, "{0} rollback of step {1}", new Object[]{ctxt.getInvocationId(), idx}); - try { - createStep(wsd).rollback(ctxt, failure); - } finally { - if (idx > 0) { - rollback(wf, ctxt, failure, --idx); - } else { - unlockDataset(ctxt); + private void rollback(Workflow wf, WorkflowContext ctxt, Failure failure, int lastCompletedStepIdx) { + ctxt = refresh(ctxt); + final List steps = wf.getSteps(); + + for ( int stepIdx = lastCompletedStepIdx; stepIdx >= 0; --stepIdx ) { + WorkflowStepData wsd = steps.get(stepIdx); + WorkflowStep step = createStep(wsd); + + try { + logger.log(Level.INFO, "Workflow {0} step {1}: Rollback", new Object[]{ctxt.getInvocationId(), stepIdx}); + rollbackStep(step, ctxt, failure); + + } catch (Exception e) { + logger.log(Level.WARNING, "Workflow " + ctxt.getInvocationId() + + " step " + stepIdx + ": Rollback error: " + e.getMessage(), e); } + + } + + logger.log( Level.INFO, "Removing workflow lock"); + try { + engine.submit( new RemoveLockCommand(ctxt.getRequest(), 
ctxt.getDataset(), DatasetLock.Reason.Workflow) ); + + // Corner case - delete locks generated within this same transaction. + Query deleteQuery = em.createQuery("DELETE from DatasetLock l WHERE l.dataset.id=:id AND l.reason=:reason"); + deleteQuery.setParameter("id", ctxt.getDataset().getId() ); + deleteQuery.setParameter("reason", DatasetLock.Reason.Workflow ); + deleteQuery.executeUpdate(); + + } catch (CommandException ex) { + logger.log(Level.SEVERE, "Error restoring dataset locks state after rollback: " + ex.getMessage(), ex); } } /** - * Unlocks the dataset after the workflow is over. - * @param ctxt + * Execute the passed workflow, starting from {@code initialStepIdx}. + * @param wf The workflow to run. + * @param ctxt Execution context to run the workflow in. + * @param initialStepIdx 0-based index of the first step to run. */ - @Asynchronous - private void unlockDataset( WorkflowContext ctxt ) { - try { - engine.submit( new RemoveLockCommand(ctxt.getRequest(), ctxt.getDataset()) ); - } catch (CommandException ex) { - logger.log(Level.SEVERE, "Cannot unlock dataset after rollback: " + ex.getMessage(), ex); + private void executeSteps(Workflow wf, WorkflowContext ctxt, int initialStepIdx ) { + final List steps = wf.getSteps(); + + for ( int stepIdx = initialStepIdx; stepIdx < steps.size(); stepIdx++ ) { + WorkflowStepData wsd = steps.get(stepIdx); + WorkflowStep step = createStep(wsd); + WorkflowStepResult res = runStep(step, ctxt); + + try { + if (res == WorkflowStepResult.OK) { + logger.log(Level.INFO, "Workflow {0} step {1}: OK", new Object[]{ctxt.getInvocationId(), stepIdx}); + + } else if (res instanceof Failure) { + logger.log(Level.WARNING, "Workflow {0} failed: {1}", new Object[]{ctxt.getInvocationId(), ((Failure) res).getReason()}); + rollback(wf, ctxt, (Failure) res, stepIdx-1 ); + return; + + } else if (res instanceof Pending) { + pauseAndAwait(wf, ctxt, (Pending) res, stepIdx); + return; + } + + } catch ( Exception e ) { + logger.log(Level.WARNING, "Workflow {0} step {1}: Uncought exception:", new Object[]{ctxt.getInvocationId(), e.getMessage()}); + logger.log(Level.WARNING, "Trace:", e); + rollback(wf, ctxt, (Failure) res, stepIdx-1 ); + return; + } } + + workflowCompleted(wf, ctxt); + + } + + ////////////////////////////////////////////////////////////// + // Internal methods to run each step in its own transaction. 
+ // + + @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) + WorkflowStepResult runStep( WorkflowStep step, WorkflowContext ctxt ) { + return step.run(ctxt); + } + + @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) + WorkflowStepResult resumeStep( WorkflowStep step, WorkflowContext ctxt, Map localData, String externalData ) { + return step.resume(ctxt, localData, externalData); + } + + @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) + void rollbackStep( WorkflowStep step, WorkflowContext ctxt, Failure reason ) { + step.rollback(ctxt, reason); + } + + @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) + void lockDataset( WorkflowContext ctxt ) throws CommandException { + final DatasetLock datasetLock = new DatasetLock(DatasetLock.Reason.Workflow, ctxt.getRequest().getAuthenticatedUser()); +// engine.submit(new AddLockCommand(ctxt.getRequest(), ctxt.getDataset(), datasetLock)); + datasetLock.setDataset(ctxt.getDataset()); + em.persist(datasetLock); + em.flush(); } + // + // + ////////////////////////////////////////////////////////////// + private void pauseAndAwait(Workflow wf, WorkflowContext ctxt, Pending pendingRes, int idx) { PendingWorkflowInvocation pending = new PendingWorkflowInvocation(wf, ctxt, pendingRes); pending.setPendingStepIdx(idx); em.persist(pending); } - @Asynchronous private void workflowCompleted(Workflow wf, WorkflowContext ctxt) { logger.log(Level.INFO, "Workflow {0} completed.", ctxt.getInvocationId()); if ( ctxt.getType() == TriggerType.PrePublishDataset ) { try { engine.submit( new FinalizeDatasetPublicationCommand(ctxt.getDataset(), ctxt.getDoiProvider(), ctxt.getRequest()) ); - unlockDataset(ctxt); - + } catch (CommandException ex) { logger.log(Level.SEVERE, "Exception finalizing workflow " + ctxt.getInvocationId() +": " + ex.getMessage(), ex); rollback(wf, ctxt, new Failure("Exception while finalizing the publication: " + ex.getMessage()), wf.steps.size()-1); @@ -273,5 +345,11 @@ private WorkflowStep createStep(WorkflowStepData wsd) { } return provider.getStep(wsd.getStepType(), wsd.getStepParameters()); } + + private WorkflowContext refresh( WorkflowContext ctxt ) { + return new WorkflowContext( ctxt.getRequest(), + datasets.find( ctxt.getDataset().getId() ), ctxt.getNextVersionNumber(), + ctxt.getNextMinorVersionNumber(), ctxt.getType(), ctxt.getDoiProvider() ); + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/HttpSendReceiveClientStep.java b/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/HttpSendReceiveClientStep.java index 8d882de5303..3bbd294ee72 100644 --- a/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/HttpSendReceiveClientStep.java +++ b/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/HttpSendReceiveClientStep.java @@ -54,7 +54,7 @@ public WorkflowStepResult run(WorkflowContext context) { } } catch (Exception ex) { - logger.log(Level.SEVERE, "Error communicating with remote server: " + ex.getMessage(), ex ); + logger.log(Level.SEVERE, "Error communicating with remote server: " + ex.getMessage(), ex); return new Failure("Error executing request: " + ex.getLocalizedMessage(), "Cannot communicate with remote server."); } } @@ -66,6 +66,7 @@ public WorkflowStepResult resume(WorkflowContext context, Map in if ( pat.matcher(response).matches() ) { return OK; } else { + logger.log(Level.WARNING, "Remote system returned a bad reposonse: {0}", externalData); return new Failure("Response from remote server did not match expected one (response:" + 
response + ")"); } } diff --git a/src/main/resources/META-INF/persistence.xml b/src/main/resources/META-INF/persistence.xml index 9303aa98ea4..8b4e33858ac 100644 --- a/src/main/resources/META-INF/persistence.xml +++ b/src/main/resources/META-INF/persistence.xml @@ -15,8 +15,9 @@ - + + Router diff --git a/src/main/webapp/dashboard-users.xhtml b/src/main/webapp/dashboard-users.xhtml index 5a970b5c2ed..98e2b6abddf 100644 --- a/src/main/webapp/dashboard-users.xhtml +++ b/src/main/webapp/dashboard-users.xhtml @@ -90,7 +90,7 @@ -

+

@@ -104,7 +104,7 @@ -

#{DashboardUsersPage.confirmRemoveRolesMessage}

+

#{DashboardUsersPage.confirmRemoveRolesMessage}

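The WorkflowServiceBean rewrite above turns the old recursive forward() into an iterative executeSteps() loop: every step runs in its own REQUIRES_NEW transaction, a Pending result parks the invocation as a PendingWorkflowInvocation, and a Failure (or an uncaught exception) rolls back the already-completed steps in reverse order before the Workflow lock is released. A standalone sketch of that control flow, with hypothetical Step and Result types rather than the real WorkflowStep SPI:

import java.util.Arrays;
import java.util.List;

// Illustration only: transaction boundaries, persistence of pending invocations
// and lock handling are all simplified away here.
public class WorkflowLoopSketch {

    interface Step {
        Result run();
        void rollback();
    }

    enum Result { OK, PENDING, FAILURE }

    static void execute(List<Step> steps) {
        for (int idx = 0; idx < steps.size(); idx++) {
            Result res;
            try {
                res = steps.get(idx).run();          // in Dataverse each run is its own transaction
            } catch (RuntimeException e) {
                res = Result.FAILURE;                // uncaught exceptions are treated as failures
            }
            if (res == Result.PENDING) {
                System.out.println("pausing at step " + idx + " until the external system calls back");
                return;                              // the real code persists a PendingWorkflowInvocation
            }
            if (res == Result.FAILURE) {
                for (int back = idx - 1; back >= 0; back--) {
                    steps.get(back).rollback();      // reverse-order rollback of completed steps
                }
                System.out.println("workflow failed at step " + idx + "; the Workflow lock would be removed here");
                return;
            }
        }
        System.out.println("all steps OK; a pre-publish workflow would now submit FinalizeDatasetPublicationCommand");
    }

    public static void main(String[] args) {
        Step ok = new Step() {
            public Result run() { System.out.println("step ran"); return Result.OK; }
            public void rollback() { System.out.println("step rolled back"); }
        };
        Step failing = new Step() {
            public Result run() { return Result.FAILURE; }
            public void rollback() { /* nothing to undo */ }
        };
        execute(Arrays.asList(ok, ok, failing));     // two steps succeed, then both are rolled back
    }
}

Running each step in its own transaction is presumably what lets a failing step be rolled back without discarding work that earlier steps have already committed, which the new REQUIRES_NEW helper methods make possible.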
diff --git a/src/main/webapp/dataset-license-terms.xhtml b/src/main/webapp/dataset-license-terms.xhtml index f87de710746..3cf8e96579f 100644 --- a/src/main/webapp/dataset-license-terms.xhtml +++ b/src/main/webapp/dataset-license-terms.xhtml @@ -11,7 +11,7 @@ jsf:rendered="#{dataverseSession.user.authenticated and empty editMode and !widgetWrapper.widgetView and permissionsWrapper.canIssueUpdateDatasetCommand(DatasetPage.dataset)}"> + update="@form,:messagePanel" oncomplete="javascript:post_edit_terms()" disabled="#{DatasetPage.lockedFromEdits}"> #{bundle['file.dataFilesTab.terms.editTermsBtn']} @@ -69,17 +69,13 @@ [...] + or !empty termsOfUseAndAccess.conditions or !empty termsOfUseAndAccess.disclaimer)}"> --> [...] @@ -303,17 +300,13 @@ [...] + or !empty termsOfUseAndAccess.studyCompletion)}"> --> [...] diff --git a/src/main/webapp/dataset.xhtml b/src/main/webapp/dataset.xhtml index 78955d3ddae..ce887ccd094 100755 --- a/src/main/webapp/dataset.xhtml +++ b/src/main/webapp/dataset.xhtml @@ -24,16 +24,21 @@ [...] @@ -136,7 +141,7 @@ -