Spike 5000 queries round deux #1039

Merged
merged 132 commits into from
Jan 17, 2025
Changes from 117 commits
3b87901
Store position in export
epugh Jun 24, 2024
1a834ff
nicer message
epugh Jun 24, 2024
9765a8f
better temp data
epugh Jun 24, 2024
49267a1
remove debugging
epugh Jun 24, 2024
be2961c
Merge branch 'main' into spike_5000_queries_round_deux
epugh Jun 26, 2024
f916fb1
started the query_runner
epugh Jun 26, 2024
e4b8f2d
Merge branch 'main' into spike_5000_queries_round_deux
epugh Jul 1, 2024
ae6ba9f
good progress on TDD based approach to the fetch service.
epugh Jul 5, 2024
7d4e7d4
rubocop
epugh Jul 5, 2024
29a9a14
Now storing the response status and body text. Practiced using the …
epugh Jul 9, 2024
a69b32b
Use new sidekiq, which better supports long jobs!
epugh Jul 11, 2024
19e4f82
we had a reason for why this was the way it was!
epugh Jul 11, 2024
cc49698
I hope this is good stuff
epugh Sep 10, 2024
de37878
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 7, 2024
5475178
Nicer messaging of what to do next..
epugh Dec 7, 2024
01564c7
Add to the example some docs on this property
epugh Dec 7, 2024
4b37544
Finish work to show all the jobs that have a status
epugh Dec 7, 2024
06920e3
Fix image path
epugh Dec 7, 2024
c2ac549
Fix set up for production
epugh Dec 7, 2024
ec4e4fa
Bump solid_cable to work better in dev mode
epugh Dec 7, 2024
50c514f
Lots of changes
epugh Dec 7, 2024
ce41533
More robust handling as we cahnge Faraday gems
epugh Dec 7, 2024
e91dcf9
Focus the test
epugh Dec 7, 2024
c72ba69
Now running some real queries!
epugh Dec 7, 2024
a5afbf5
lint
epugh Dec 7, 2024
5a3ba41
Let's see how this goes...!
epugh Dec 9, 2024
01f74ab
try this..
epugh Dec 9, 2024
73e6aa3
argh
epugh Dec 9, 2024
d11e308
something werid...
epugh Dec 9, 2024
6ed0312
once more..
epugh Dec 9, 2024
0d8bbe5
lets do it every hour for now..
epugh Dec 9, 2024
bc7ee0f
maybe?
epugh Dec 9, 2024
7e433a8
Lets not have to migrate by hand!
epugh Dec 10, 2024
0f68a02
fix the sorting of the try
epugh Dec 10, 2024
3f5694c
check if blank
epugh Dec 10, 2024
1c544aa
tweak docs
epugh Dec 10, 2024
c4fcac0
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 11, 2024
3d87b60
Add timestamp to snapshot name with just date, remove some dead code.
epugh Dec 11, 2024
f956143
bump to lastest
epugh Dec 11, 2024
ec87950
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 11, 2024
de534a8
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 12, 2024
da2fbc4
Update test to match logic change.
epugh Dec 12, 2024
47180da
make method NOT private
epugh Dec 12, 2024
b66d1af
chang
epugh Dec 12, 2024
43a9215
does this change memory profile>
epugh Dec 12, 2024
8cae929
Trialing converting raw solr output into snapshot docs...
epugh Dec 12, 2024
b07a55b
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 16, 2024
abd97bc
more explicit directions (and a nightly)
epugh Dec 16, 2024
31d7e3b
let user specify nightly cases
epugh Dec 16, 2024
cda673a
lint
epugh Dec 17, 2024
f249570
be smarter about when a reload is needed, so we don't reload 50,000 t…
epugh Dec 17, 2024
e1de561
nightly dammit
epugh Dec 17, 2024
40e3cf1
Fix up styling and logic around this option
epugh Dec 17, 2024
75589ae
fix the pattern
epugh Dec 17, 2024
5792a5c
lint
epugh Dec 17, 2024
7da812e
ensure we have json for explain
epugh Dec 17, 2024
f17dcec
Nicer handling of the repeat icon
epugh Dec 17, 2024
e95c485
don't need --service-ports
epugh Dec 17, 2024
a3ed685
Testing code that shouldn't have been committed
epugh Dec 18, 2024
a66f9dc
Actually Running a Javascript scorer!
epugh Dec 18, 2024
b89250f
try and shrink slug
epugh Dec 18, 2024
2977bb4
comments
epugh Dec 18, 2024
51e4910
can we get under 500 mb?
epugh Dec 18, 2024
a3d9f02
lets try this
epugh Dec 18, 2024
d63ef34
okay, that didn't owrk...
epugh Dec 18, 2024
49a9064
argh
epugh Dec 18, 2024
81dbf13
Fix up handling some errors..
epugh Dec 18, 2024
a0c41d3
rubocop
epugh Dec 18, 2024
a450ef0
refacotring
epugh Dec 18, 2024
11dfb5d
MORE OFTEN
epugh Dec 18, 2024
a887ee6
be robust on checking for a user now that it isnt required
epugh Dec 18, 2024
4dcb571
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 20, 2024
4ccfd74
RunCaseJob is better name
epugh Dec 20, 2024
9a16981
actually persist hte change
epugh Dec 20, 2024
a8172a9
Dead code
epugh Dec 20, 2024
a191c89
Move the heavy web response to it's own table
epugh Dec 20, 2024
bfe4ff0
part of job renaming
epugh Dec 20, 2024
627e599
whtiespace
epugh Dec 20, 2024
0fecbd0
lint
epugh Dec 20, 2024
879b7d7
lets move from 4999 to 5000 so i dont think it broke or lost a doc
epugh Dec 21, 2024
68b4ac6
user_id is no longer required, but a try is!
epugh Dec 23, 2024
521e49e
move towards evaling from code...
epugh Dec 23, 2024
4710008
Use correct class name...
epugh Dec 24, 2024
e554bd6
Now read in p@10 logic in our tests
epugh Dec 27, 2024
68e77fc
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 27, 2024
eedb206
Fix the new name
epugh Dec 27, 2024
51dc7b0
fix up test
epugh Dec 27, 2024
34cbf54
Attempt to reduce logging
epugh Dec 27, 2024
ff73545
Revert "Attempt to reduce logging"
epugh Dec 27, 2024
ff47e82
less frequent
epugh Dec 27, 2024
359b76e
Revert "fix up test"
epugh Dec 27, 2024
43f6f4c
Reduce logging, clean ups
epugh Dec 27, 2024
433640f
lint
epugh Dec 27, 2024
a297ac6
Remove no longer needed js file
epugh Dec 28, 2024
ea97b5f
AP@10 now working
epugh Dec 28, 2024
097568e
provide real list of best docs
epugh Dec 28, 2024
7195aa6
Now run all communal scorers successfully!
epugh Dec 28, 2024
8ba6618
Does this deal with instantiations?
epugh Dec 28, 2024
53850b4
smarter handling of json
epugh Dec 28, 2024
e5a9ace
lint
epugh Dec 28, 2024
b3d598d
lets test
epugh Dec 28, 2024
5e05e0b
Reduce the insane logging!
epugh Dec 28, 2024
98ce8ac
Deal with NaN scores
epugh Dec 28, 2024
51b69a4
once more
epugh Dec 28, 2024
992ba1c
argh
epugh Dec 28, 2024
e606196
pages is kicking in when we don't want it.. today we only have cookies
epugh Dec 28, 2024
74cb6c0
Constrain what pages is looking up to make sure it doesn't interfere …
epugh Dec 28, 2024
3b91266
Deal with NaN from AP scorer
epugh Dec 28, 2024
ba9cc97
add rows and field query specification for Solr queries
epugh Dec 30, 2024
7a16cb9
richer handling of field spec
epugh Dec 30, 2024
41619aa
Average the query scores for the case Score
epugh Dec 30, 2024
4dea2bb
Prettier formatting of the case title with nightly icon
epugh Dec 30, 2024
d955d9a
skip the all_rated
epugh Dec 30, 2024
ae4b093
clean up a few lines
epugh Dec 30, 2024
5264f9a
Merge branch 'main' into spike_5000_queries_round_deux
epugh Dec 31, 2024
e64f673
does this matter?
epugh Dec 31, 2024
75acfd2
Fix eachDoc() scorer and relocate the class to lib dir
epugh Dec 31, 2024
aa13ce4
Lets start our tries with 1, not 0.
epugh Dec 31, 2024
52fd30c
Provide additional fields for scorers to access
epugh Dec 31, 2024
426bcbb
Frustrating attempt to limit simultanous bulk_processing jbos
epugh Dec 31, 2024
52b247f
be clear that this is a case... we know it's quepid!
epugh Dec 31, 2024
e2324e2
we always want the try number to start with 1
epugh Dec 31, 2024
ac773d3
One more update for tries starting at 1
epugh Dec 31, 2024
6258bb4
See if this works... Will know Jan 3rd!
epugh Jan 2, 2025
c9eaf91
Fix the query params picking so Solr doesn't blow up
epugh Jan 5, 2025
f4e24cb
Merge branch 'main' into spike_5000_queries_round_deux
epugh Jan 6, 2025
661f712
mention workaround
epugh Jan 7, 2025
1c310b4
Merge branch 'main' into spike_5000_queries_round_deux
epugh Jan 16, 2025
6a154c4
Strip out extra web reqeusts beyond the first snapshot to maintain size
epugh Jan 16, 2025
1ec30d5
clean up
epugh Jan 16, 2025
fc479b9
enhance test in case...
epugh Jan 16, 2025
1211c79
lint
epugh Jan 16, 2025
4 changes: 4 additions & 0 deletions .env.example
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,10 @@ SOLID_CABLE_POLLING=0.1.seconds
# /case page.
FORCE_SSL=false

# This makes Quepid show detailed error messages in the UI instead of a generic 500 page,
# useful while testing a deployment in Production.
QUEPID_CONSIDER_ALL_REQUESTS_LOCAL=false

DB_HOST=mysql
DB_USERNAME=root
DB_PASSWORD=password
Expand Down
2 changes: 2 additions & 0 deletions Gemfile
Expand Up @@ -93,3 +93,5 @@ group :test do
gem 'capybara'
gem 'selenium-webdriver'
end

gem 'mini_racer', '~> 0.16.0'
5 changes: 5 additions & 0 deletions Gemfile.lock
Expand Up @@ -253,6 +253,8 @@ GEM
childprocess (~> 5.0)
letter_opener (1.10.0)
launchy (>= 2.2, < 4)
libv8-node (18.19.0.0-x86_64-darwin)
libv8-node (18.19.0.0-x86_64-linux)
listen (3.9.0)
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
Expand All @@ -271,6 +273,8 @@ GEM
memory_profiler (1.1.0)
mini_histogram (0.3.1)
mini_mime (1.1.5)
mini_racer (0.16.0)
libv8-node (~> 18.19.0.0)
minitest (5.25.4)
minitest-reporters (1.7.1)
ansi
Expand Down Expand Up @@ -578,6 +582,7 @@ DEPENDENCIES
listen (~> 3.3)
local_time
memory_profiler
mini_racer (~> 0.16.0)
minitest-reporters (>= 0.5.0)
mission_control-jobs (~> 0.5.0)
mocha (~> 2.7)
Expand Down
3 changes: 2 additions & 1 deletion Procfile
@@ -1,2 +1,3 @@
web: bundle exec puma -C config/puma.rb
-worker: bundle exec rake solid_queue:start
+# worker: bundle exec bin/jobs
+worker: bundle exec rake solid_queue:start # per https://github.com/rails/solid_queue/issues/405
4 changes: 2 additions & 2 deletions README.md
Expand Up @@ -272,9 +272,9 @@ bin/docker r bundle exec derailed bundle:mem

### Debugging JS

-While running the application, you can debug the javascript using your favorite tool, the way you've always done it.
+While running the application, you can debug the JavaScript using your favorite tool, the way you've always done it.

-The javascript files will be concatenated into one file, using the rails asset pipeline.
+The JavaScript files will be concatenated into one file, using the rails asset pipeline.

You can turn that off by toggling the following flag in `config/environments/development.rb`:

Expand Down
3 changes: 2 additions & 1 deletion app.json
Expand Up @@ -83,13 +83,14 @@
"worker": {
"quantity": 1,
"size": "standard-1x",
"command": "bundle exec rake solid_queue:start"
"command": "bundle exec bin/jobs"
},
"web": {
"quantity": 1,
"size": "standard-1x"
}
},
"scripts": {
"postdeploy": "bundle exec rake db:migrate"
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
ng-click="ctrl.goToCase()"
>
{{ ctrl.thisCase.caseName }}

<i class="bi bi-repeat" ng-show="ctrl.thisCase.nightly"></i>
</span>
<br/>
<span class="item-actions">
Expand Down
2 changes: 1 addition & 1 deletion app/assets/javascripts/components/export_case/_modal.html
Expand Up @@ -29,7 +29,7 @@ <h3 class="modal-title">Export Case: <span class="modal-case">{{ ctrl.theCase.ca
Detailed export is only supported from the individual Case view.
</p>
<span class="help-block">
-CSV file with <code>Team Name,Case Name,Case ID,Query Text,Doc ID,Title,Rating,Field1,...,FieldN</code> where <code>Field1,...,FieldN</code> are specified under <strong>Settings</strong> in the <strong>Displayed Fields</strong> field.
+CSV file with <code>Team Name,Case Name,Case ID,Query Text,Doc ID,Position,Title,Rating,Field1,...,FieldN</code> where <code>Field1,...,FieldN</code> are specified under <strong>Settings</strong> in the <strong>Displayed Fields</strong> field.
</span>
</div>

Expand Down
5 changes: 5 additions & 0 deletions app/assets/javascripts/controllers/case.js
Expand Up @@ -14,6 +14,11 @@ angular.module('QuepidApp')
$scope.caseModel.reorderEnabled = false;
$scope.scores = [];
$scope.theCase = caseSvc.getSelectedCase();

$scope.updateNightly = function () {
caseSvc.updateNightly($scope.theCase);
};

$scope.caseName = {
name: null,
startRename: false,
Expand Down
31 changes: 29 additions & 2 deletions app/assets/javascripts/factories/ScorerFactory.js
Expand Up @@ -7,6 +7,9 @@
ScorerFactory
]);

// This file contains JavaScript logic that supports running the Scorers on the client side.
// Many of the methods are duplicated in lib/scorer_logic.js to support running Scorers on the server side,
// so be aware if you make any changes.
function ScorerFactory($q, $timeout) {
var Scorer = function(data) {
var self = this;
Expand Down Expand Up @@ -63,10 +66,12 @@


// public functions
// NOT in scorer_logic.js
function getColors() {
return scaleToColors(self.scale);
}

// NOT in scorer_logic.js
function scaleToArray(string) {
return string.replace(/^\s+|\s+$/g,'')
.split(/\s*,\s*/)
Expand All @@ -75,6 +80,7 @@
});
}

// NOT in scorer_logic.js
function scaleToColors (scale) {
var colorMap = {};

Expand Down Expand Up @@ -110,6 +116,7 @@
return colorMap;
}

// NOT in scorer_logic.js
function scaleToScaleWithLabels(scale, scaleWithLabels) {
if ( angular.isUndefined(scaleWithLabels) || scaleWithLabels === null ) {
scaleWithLabels = {};
Expand All @@ -128,6 +135,7 @@
return scaleWithLabels;
}

// NOT in scorer_logic.js
function setDisplayName(name, communal) {
if ( communal === true ) {
return name + ' (Communal)';
Expand All @@ -136,13 +144,15 @@
}
}

// NOT in scorer_logic.js
function showScaleLabel(value) {
return self.showScaleLabels === true &&
self.scaleWithLabels !== null &&
angular.isDefined(self.scaleWithLabels) &&
angular.isDefined(self.scaleWithLabels[value]);
}

// NOT in scorer_logic.js
function teamNames() {
var teams = [];
angular.forEach(self.teams, function(team) {
Expand All @@ -152,6 +162,7 @@
return self.teamName || teams.join(', ');
}

// NOT in scorer_logic.js
function baseAvg(docs, count) {
var sum = 0.0;
var docsRated = 0;
Expand All @@ -177,11 +188,13 @@
}
}

// NOT in scorer_logic.js
function baseAvgRounded(docs, count) {
var avg = self.baseAvg(docs, count);
return Math.floor(avg);
}

// NOT in scorer_logic.js
function avg100(docs, count) {
var max = self.scale[self.scale.length -1];
var multiplier = 100 / max;
Expand All @@ -194,6 +207,7 @@
}
}

// NOT in scorer_logic.js
function editDistance(str1, str2) {

var makeZeroArr = function(len) {
Expand Down Expand Up @@ -240,6 +254,7 @@
return bestDocsRatings;
}

// NOT in scorer_logic.js
function distanceFromBest(docs, bestDocs, count) {
if ( angular.isUndefined(count) ) {
count = DEFAULT_NUM_DOCS;
Expand Down Expand Up @@ -301,6 +316,7 @@
// We could not get this to work/test, and spent too much time on it.
// Leaving it here until we do figure out
// -YC
// NOT in scorer_logic.js
function checkCodeExecutionTime() {
return $q(function(resolve, reject) {
var myWorker = new Worker('scripts/scorerEvalTest.js');
Expand All @@ -320,6 +336,7 @@
});
}

// NOT in scorer_logic.js
function checkCode() {
var deferred = $q.defer();
var loopPromise = hasLoop();
Expand All @@ -336,6 +353,7 @@
}

// mode may no longer be used.. maybe it was for unit test style scorers?
// NOT in scorer_logic.js
function runCode(query, total, docs, bestDocs, mode, options) {
var scale = self.scale;
var max = scale[scale.length-1];
Expand Down Expand Up @@ -423,12 +441,12 @@
return baseAvg(docs, count);
};

-// Used in the original v1 scorer, which were replaced by the
-// classic search relevance metrics @P etc.
+// NOT in scorer_logic.js
var avgRating100 = function(count) {
return avg100(docs, count);
};

// NOT in scorer_logic.js
var editDistanceFromBest = function(count) {
return distanceFromBest(docs, bestDocs, count);
};
Expand Down Expand Up @@ -459,6 +477,8 @@
}
};

// may not be called
// NOT in scorer_logic.js
var refreshRatedDocs = function(k) {
return query.refreshRatedDocs(k);
};
Expand All @@ -468,6 +488,8 @@
// param that is passed, and calls the callback function on
// each doc.
// Even those that are not in the top 10 current.
// may not be used?
// NOT in scorer_logic.js
//
// @param score, Int
// @param f, Callback
Expand Down Expand Up @@ -503,16 +525,19 @@
}
};

// NOT in scorer_logic.js
var recordDepthOfRanking = function (k){
query.depthOfRating = k;
self.depthOfRating = k;
};

// NOT in scorer_logic.js
/*jshint unused:false */
function pass() {
scorerDeferred.resolve(100);
}

// NOT in scorer_logic.js
function fail() {
scorerDeferred.reject(0);
}
Expand All @@ -521,12 +546,14 @@
scorerDeferred.resolve(score);
}

// NOT in scorer_logic.js
function assert(cond) {
if (!cond) {
fail();
}
}

// NOT in scorer_logic.js
function assertOrScore(cond, score) {
if (!cond) {
setScore(score);
Expand Down
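The comment added at the top of `ScorerFactory.js` says that some scorer helpers are duplicated in `lib/scorer_logic.js` so they can run on the server. As a rough sketch of what a helper free of Angular and browser dependencies might look like, here is a hypothetical `averageRating` function. The name, the `null` convention for unrated docs, and the export guard are all assumptions made for illustration and are not taken from this PR.

```javascript
// Hypothetical helper, not code from this PR.
// Assumes each doc exposes getRating(), returning a number or null when unrated.
function averageRating(docs, count) {
  let sum = 0;
  let docsRated = 0;
  const limit = Math.min(count, docs.length);

  for (let i = 0; i < limit; i++) {
    const rating = docs[i].getRating();
    if (rating !== null && rating !== undefined) {
      sum += rating;
      docsRated += 1;
    }
  }

  // Return null instead of dividing by zero, so callers never see NaN.
  return docsRated > 0 ? sum / docsRated : null;
}

// Export for Node-style runtimes; in a browser script the function is a plain global.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { averageRating: averageRating };
}
```

Because the function only touches its arguments, the same source text could in principle be loaded by the browser and by a server-side JavaScript runtime, which is one way to limit the duplication the comment warns about.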
26 changes: 4 additions & 22 deletions app/assets/javascripts/factories/snapshotFactory.js
Expand Up @@ -6,12 +6,13 @@
angular.module('QuepidApp')
.factory('SnapshotFactory', [
'$log',
'$filter',
'docCacheSvc',
'normalDocsSvc',
SnapshotFactory
]);

-function SnapshotFactory($log, docCacheSvc, normalDocsSvc) {
+function SnapshotFactory($log, $filter, docCacheSvc, normalDocsSvc) {
var Snapshot = function(params) {
var self = this;

Expand All @@ -24,8 +25,7 @@

self.allDocIds = allDocIds;
self.getSearchResults = getSearchResults;
-self.timestamp = timestamp;


self.docIdsPerQuery = {};

// Map from snake_case to camelCase.
Expand All @@ -45,7 +45,7 @@
});

function snapshotName () {
-return 'Snapshot: ' + params.name;
+return '(' + $filter('date')(params.time, 'shortDate') + ') ' + params.name;
}

function allDocIds () {
Expand Down Expand Up @@ -97,24 +97,6 @@

return searchResults;
}

-function timestamp () {
-var date = new Date(self.time * 1000);
-
-var hour = date.getHours();
-var minutes = date.getMinutes();
-var year = date.getFullYear();
-var day = date.getDate();
-
-var months = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'];
-var month = months[date.getMonth()];
-
-if (minutes < 10) {
-minutes = '0' + ('' + minutes);
-}
-
-return day + '-' + month + '-' + year + ' ' + hour + ':' + minutes;
-}
};

// Return factory object
Expand Down
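The `snapshotFactory.js` change drops the hand-rolled `timestamp()` helper and builds the snapshot name from a date-only prefix plus the user-supplied name. The sketch below reproduces that naming rule in plain JavaScript, without Angular's `$filter`. The `shortDate` helper here is a hypothetical stand-in that assumes an en-US style `M/d/yy` layout; the real output depends on the locale Angular is configured with.

```javascript
// Hypothetical stand-in for Angular's 'shortDate' format, assuming en-US M/d/yy.
function shortDate(date) {
  const yy = String(date.getFullYear() % 100).padStart(2, '0');
  return (date.getMonth() + 1) + '/' + date.getDate() + '/' + yy;
}

// Mirrors the new naming rule from the diff: "(<date>) <name>".
// params.time may be a Date or anything the Date constructor accepts.
function snapshotName(params) {
  return '(' + shortDate(new Date(params.time)) + ') ' + params.name;
}

// Example: a snapshot taken on 11 December 2024 (local time).
console.log(snapshotName({ name: 'baseline', time: new Date(2024, 11, 11, 9, 30) }));
```

Putting the date first keeps snapshots from different nights distinguishable in a dropdown even when they share the same user-supplied name.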
4 changes: 3 additions & 1 deletion app/assets/javascripts/services/caseCSVSvc.js
Expand Up @@ -79,6 +79,7 @@
'Case ID',
'Query Text',
'Doc ID',
'Doc Position',
'Title',
'Rating',
];
Expand Down Expand Up @@ -353,7 +354,7 @@
csvContent += dataString + EOL;
}
else {
-angular.forEach(docs, function (doc) {
+angular.forEach(docs, function (doc, index) {
var dataString;
var infoArray = [];

Expand All @@ -362,6 +363,7 @@
infoArray.push(stringifyField(aCase.lastScore.case_id));
infoArray.push(stringifyField(query.queryText));
infoArray.push(stringifyField(doc.id));
infoArray.push(stringifyField(index+1));
infoArray.push(stringifyField(doc.title));
infoArray.push(stringifyField(doc.getRating()));

Expand Down
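The `caseCSVSvc.js` change adds a position column to the detailed export, using the loop index plus one so that positions start at 1. A minimal standalone sketch of that idea follows. The `docsToCsvRows` function, the simplified `stringifyField`, and the plain `{id, title, rating}` doc shape are assumptions for illustration; the real service works with richer doc objects and more columns.

```javascript
// Simplified, hypothetical quoting helper: wraps the value in double quotes
// and doubles any embedded quotes. Null and undefined become empty fields.
function stringifyField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return '"' + text.replace(/"/g, '""') + '"';
}

// Builds one CSV row per doc. The position is the zero-based array index plus 1,
// matching the index+1 expression in the diff.
function docsToCsvRows(queryText, docs) {
  return docs.map(function (doc, index) {
    return [queryText, doc.id, index + 1, doc.title, doc.rating]
      .map(stringifyField)
      .join(',');
  });
}

const rows = docsToCsvRows('star wars', [
  { id: 'a', title: 'A "New" Hope', rating: 3 },
  { id: 'b', title: 'Empire', rating: null }
]);
console.log(rows.join('\n'));
```

The position reflects where the doc sat in the result list at export time, so it is only meaningful if `docs` is still in ranked order when the rows are built.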