Added partially extracted slots support for the GroupSlots #394

Merged
merged 25 commits into from
Nov 7, 2024

NotBioWaste905
Collaborator

@NotBioWaste905 NotBioWaste905 commented Oct 1, 2024

Description

Added the flag allow_partially_extracted to the GroupSlot class constructor. If True, sub-slots of the group slot that were not extracted are not overwritten with default values.
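
The effect of the flag can be illustrated with a toy model in plain Python. This is only a sketch, not Chatsky's implementation: the `extract_group` function, its `allow_partial` parameter (a stand-in for the flag), the regular expressions, and the dict-based slot storage are all assumptions made for illustration. Only the behavior itself (failed sub-slots are not overwritten with defaults) comes from the description above.

```python
import re

# Hypothetical sub-slot patterns; not taken from the Chatsky tutorials.
PATTERNS = {
    "username": re.compile(r"username is (\w+)"),
    "email": re.compile(r"email is (\S+@\S+\.\w+)"),
}


def extract_group(stored, text, allow_partial):
    """Return the group's stored values after processing `text`."""
    extracted = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        extracted[name] = match.group(1) if match else None

    if allow_partial:
        # Save only the sub-slots that were extracted; keep the rest as stored.
        updates = {k: v for k, v in extracted.items() if v is not None}
        return {**stored, **updates}
    # Without the flag every sub-slot is overwritten,
    # failed ones with the default value (None here).
    return extracted


stored = {"username": "mike", "email": None}
request = "my email is mike@example.com"

print(extract_group(stored, request, allow_partial=False))
print(extract_group(stored, request, allow_partial=True))
```

In this model the first call loses the previously stored username, while the second keeps it and only adds the email.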

Checklist

  • Manual testing
  • Autotesting
  • Linting
  • Tutorials
  • Documentation

@NotBioWaste905 NotBioWaste905 requested a review from RLKRo October 1, 2024 12:42
@github-actions github-actions bot left a comment

It appears this PR is a release PR (change its base from master if that is not the case).

Here's a release checklist:

  • Update package version
  • Update poetry.lock
  • Change PR merge option
  • Update template repo
  • Search for objects to be deprecated
  • Test parts not covered with pytest:
    • web_api tutorials
    • Test integrations with external services (telegram; stats)

@NotBioWaste905 NotBioWaste905 changed the base branch from master to dev October 1, 2024 12:43
@NotBioWaste905
Collaborator Author

llm_response must inherit from BaseResponse

@RLKRo RLKRo added this to the Release v0.10 milestone Oct 2, 2024
@RLKRo
Member

RLKRo commented Oct 24, 2024

I don't like the name regex in this tutorial.
The last request is very unintuitive ("Mike " with a space at the end).
I think it's better for tutorials to use other types of data, ones that are more easily extracted.

For example, here I use email and bitcoin address which both have fairly unique regex (i.e. unlikely false positives -> more options with requests):
https://github.com/RLKRo/chatsky-ac-tests/blob/main/slot/test_data.json

(regex for these can be found here; although email regex is pretty long, this might be better)

Also, I don't think that success_only was really shown in the script and happy path.
It should show what extraction with success_only=False looks like (e.g. pass only an email for person extraction and it should return "Your username is None. Your email is None.").
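
To illustrate the point about false positives, here is a small sketch comparing a loose name pattern with an email pattern. Both regular expressions are assumptions written for this example; they are not the tutorial's patterns or the linked ones, and the email pattern is deliberately simplified.

```python
import re

# Illustrative patterns only.
NAME_RE = re.compile(r"[A-Z][a-z]+")  # any capitalized word
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")  # simplified email


def find(pattern, text):
    """Return the first match of `pattern` in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


greeting = "Hello there"
contact = "write to me at mike@example.com please"

print(find(NAME_RE, greeting))   # the name pattern fires on a greeting
print(find(EMAIL_RE, greeting))  # the email pattern does not
print(find(EMAIL_RE, contact))
```

Because the email pattern rarely matches unrelated text, requests in a tutorial can be phrased more freely without triggering an unintended extraction.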

@NotBioWaste905
Collaborator Author

NotBioWaste905 commented Oct 30, 2024

Consider adding a way to define the minimal set of slots required for the group extraction to succeed, e.g.:
cnd.GroupSlotExtracted("person", required=["email"])
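
One way the proposed `required` argument could behave is sketched below in plain Python. `group_extracted` is a hypothetical helper, not a Chatsky condition, and nothing in this thread confirms that the proposal was implemented. The dict of sub-slot values and the convention that `None` means "not extracted" are also assumptions.

```python
def group_extracted(values, required=None):
    """Check whether a group counts as extracted.

    `values` maps sub-slot names to extracted values (None = not extracted).
    If `required` is not given, every sub-slot must be extracted;
    otherwise only the listed sub-slots are checked.
    """
    names = values.keys() if required is None else required
    return all(values.get(name) is not None for name in names)


person = {"username": None, "email": "mike@example.com"}

print(group_extracted(person))                      # all sub-slots needed
print(group_extracted(person, required=["email"]))  # only email needed
```

With this design a sub-slot named in `required` but absent from the group is treated as not extracted, so the check fails rather than raising an error.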

@RLKRo RLKRo merged commit e5e286c into dev Nov 7, 2024
17 checks passed
@RLKRo RLKRo deleted the feat/slots_extraction_update branch November 7, 2024 12:55
rock-n-shrimproll pushed a commit that referenced this pull request Feb 12, 2025
# Description

Added flag `allow_partial_extraction` to the `GroupSlot` class
constructor. If `True`, group slot only saves successfully extracted sub-slots.

---------

Co-authored-by: Roman Zlobin <RLKRo@proton.me>
@RLKRo RLKRo mentioned this pull request Feb 18, 2025
RLKRo added a commit that referenced this pull request Feb 18, 2025
# Changelog

## Breaking Changes

- Dropped support for python 3.8; added support for python 3.12 (#400);
- Reworked DB architecture to support partial turn reads/writes (#93).
  Old Context storages are incompatible with the new ones.
  See tutorial Context Storages: 8 for more info;
- `Context.labels`, `Context.requests`, `Context.responses` are now only
lazily loaded (#93).
  Items from older turns can be loaded on demand.
  Their `__getitem__` and `get` methods are now async.


## Features

- Added `LLMResponse` and `LLMCondition` classes that allow using LLMs
(#376).
See the new LLM Integration tutorials and LLM user guide for more info;
- Added option to extract group slots partially (#394).
  See tutorial Slots: 2 for more information;
- `Message.original_message` is replaced with `Message.origin` which
stores both
the original message and the interface from which the message originated
(#398);
- Added `Context.current_turn_id` field which stores the number of the
current turn (#93);
- Added `Context.created_at`, `Context.updated_at` timestamp fields
(#93);
- Added `Context.turns` property which allows iterating over
requests/labels/responses by their turn ids (#93);
- `Context.labels`, `Context.requests`, `Context.responses` now support
slicing (#93).
`__getitem__`, `__setitem__` and `__delitem__` methods can now accept
slices of turn ids in addition to a single turn id.
`get` method can now accept an iterable of turn ids in addition to a
single turn id.


## Documentation

- Documentation is now versioned (#346, #409).
You can select preferred version via the drop-down menu in the top-right
corner.


## Developer changes

- Context now has a field `origin_interface` that stores the name of the
interface that created it (#398);
- Added script `docs_no_docker` for building documentation without
docker (ef11ff9);
- Added in-RAM context storage to be the default one instead of a plain
dict (#93);
- Removed methods `Context.add_request`, `Context.add_label` and
`Context.add_response` (#93).
  Use setters with `Context.current_turn_id` instead.