feat: search results - evaluation and results processing #249

Open · wants to merge 30 commits into main
Conversation

gabikadlecova
Collaborator

Reference Issues/PRs

#230 + additional features in the search process

What does this implement/fix? Explain your changes.

  • implement parameter bins to search in a stratified way over the min-max parameter range (see the sketch below)
  • QoL improvements in search
    • add objective names to logging (instead of objective_1, objective_2)
    • save the Pareto front paths to a JSON file
    • allow different secondary objectives (parameters, flops, latency)
    • test saving and loading of the extracted subnetwork config
  • fix devices for flops/latency to allow integration into the search
  • implement the evaluation workflow
    • use lm_eval_harness to evaluate a litgpt (sub-)network checkpoint
    • add a script for collecting the results in a given directory
    • add a script for plotting the collected results
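
A rough sketch of the binning idea (class and method names here are illustrative, not necessarily what this PR implements): the min-max parameter range is split into equal-width bins, and a sampled sub-network is only accepted if its bin still has room, so the accepted configs spread evenly over model size.

from dataclasses import dataclass, field


@dataclass
class ParamBins:
    min_params: int                 # smallest parameter count in the search space
    max_params: int                 # largest parameter count in the search space
    num_bins: int = 10
    max_per_bin: int = 2            # capacity of each bin
    counts: list[int] = field(default_factory=list, init=False)

    def __post_init__(self) -> None:
        self.counts = [0] * self.num_bins

    def put_in_bin(self, params: int) -> bool:
        """Record the config if the bin for `params` still has room; otherwise reject it."""
        width = (self.max_params - self.min_params) / self.num_bins
        idx = min(int((params - self.min_params) / width), self.num_bins - 1)
        if self.counts[idx] >= self.max_per_bin:
            return False
        self.counts[idx] += 1
        return True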

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

@gabikadlecova gabikadlecova changed the title Search results - evaluation and results processing feat: Search results - evaluation and results processing Feb 11, 2025
pyproject.toml Outdated
@@ -21,7 +21,8 @@ dependencies = [
    "torchvision>=0.18",
    "boto3==1.34.147",
    "botocore==1.34.147",
    "deepspeed"

Collaborator
Maybe let's revert this here and instead merge the main branch once #248 is in.

"runtime": runtime[-1],
}

print(observation)

Collaborator
remove print

Collaborator
This file seems somewhat specific to our paper. Should we have that in a separate repo?

Collaborator
This also seems specific to our experiments. Maybe move it to a separate repo?



@dataclass
class ParamBinArgs:

Collaborator
Shouldn't that be part of SearchArgs?

Collaborator Author
I can imagine that you'd use binning outside of search (e.g. for pre-selection of networks to evaluate), but I can add it to SearchArgs.
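
For illustration only (field names are made up, not the repo's actual ones), the two options being discussed could look roughly like this:

from dataclasses import dataclass, field


@dataclass
class ParamBinArgs:        # standalone: reusable outside of search, e.g. for pre-selection
    num_bins: int = 10
    max_per_bin: int = 2


@dataclass
class SearchArgs:          # alternative: nest the bin settings inside the search config
    iterations: int = 100
    param_bins: ParamBinArgs = field(default_factory=ParamBinArgs)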

@@ -19,6 +20,9 @@ def multi_objective_search(
    objective_kwargs: Optional[dict[str, Any]] = None,
    logger: Optional[Logger] = None,
    seed: Optional[int] = None,
    param_bins: Optional[ParamBins] = None,

Collaborator
I am wondering if it wouldn't be cleaner to have this in the search method instead of the objective? We could have something like a stratified random search that samples sub-networks uniformly across parameter count instead of uniformly from the search space. Wdyt?

Collaborator Author
Yes, great idea
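
A possible shape for that stratified random search (the sampling helpers are assumptions, not the repo's actual API):

def stratified_random_search(search_space, bins, objective, iterations=100):
    # Rejection-sample: only accept a config if its parameter count lands in a
    # bin that still has room, so accepted sub-networks cover the size range evenly.
    results = []
    for _ in range(iterations):
        while True:
            config = search_space.sample()              # assumed sampling API
            n_params = search_space.count_params(config)
            if bins.put_in_bin(n_params):               # ParamBins-style helper from the sketch above
                break
        results.append((config, objective(config)))
    return results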

@@ -15,6 +15,7 @@ def compute_flops(
    batch_size: int = 1,

Collaborator
same nitpick as above

Comment on lines 1 to 5
import torch
import torch.profiler
from torch.profiler import record_function
from typing import Optional

Collaborator
import sorting

Comment on lines 1 to 5
import json
import pandas as pd
from pathlib import Path
from tqdm import tqdm
from typing import Optional, Any

Collaborator
import sorting

Comment on lines 33 to 37
def setup(
    results_dir: Path,
    output_path: Optional[Path] = None,
    pareto_path: Optional[Path] = None,
) -> None:

Collaborator
same as above

Comment on lines 1 to 5
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from pathlib import Path
from typing import Optional

Collaborator
import sorting

@gabikadlecova gabikadlecova changed the title feat: Search results - evaluation and results processing feat: search results - evaluation and results processing Feb 12, 2025
Comment on lines 41 to 44
bins, _ = (
    param_bins(10, 2, 1) if search_strategy == "stratified_random_search" else None
)

Collaborator
@timurcarstensen · Feb 13, 2025
Suggested change
-bins, _ = (
-    param_bins(10, 2, 1) if search_strategy == "stratified_random_search" else None
-)
+bins, _ = (
+    param_bins(10, 2, 1)
+    if search_strategy == "stratified_random_search"
+    else (None, None)
+)

Collaborator
else this will throw an error
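
For reference, unpacking None fails immediately:

>>> bins, _ = None
Traceback (most recent call last):
  ...
TypeError: cannot unpack non-iterable NoneType object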

… both litgpt-like checkpoints and whittle subnets. Fix bins in test.
Comment on lines 82 to 86
metrics["flops"] = compute_latency(model)
if measure_latency:
    metrics["latency"] = compute_flops(
        model, batch_size=latency_batch_size, previous_device=device
    )

Collaborator
flops and latency functions are switched

Collaborator Author
Nice catch!
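
Presumably the fix just swaps the two calls while each metric keeps its arguments, something along these lines (the exact keyword arguments depend on the two functions' signatures):

metrics["flops"] = compute_flops(model)
if measure_latency:
    metrics["latency"] = compute_latency(
        model, batch_size=latency_batch_size, previous_device=device
    )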
