feat: Documents as first class citizens #504

Merged
merged 18 commits into from
Dec 31, 2024

Conversation

timonv
Member

@timonv timonv commented Dec 27, 2024

For simple RAG, adding only the content of a retrieved document might be enough. In more complex use cases, however, you might also want to include its metadata, either as is or to drive conditional formatting.

For instance, when dealing with large amounts of chunked code, providing the path goes a long way. Generated metadata, if it is good enough, could be useful as well.

With this change, retrieved Documents are treated as first-class citizens. This also paves the way for multi-retrieval (and multi-modal) support.

Documents can be formatted with tera in the answering step.
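
To illustrate the idea of metadata-driven formatting, here is a minimal sketch using only the standard library. The `Document` struct, `render_document`, and `render_context` are hypothetical stand-ins for illustration, not swiftide's actual types, and plain `format!` is used in place of a Tera template:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a retrieved document: content plus metadata.
/// This is NOT swiftide's actual `Document` type.
#[derive(Debug, Clone, Default)]
pub struct Document {
    pub content: String,
    pub metadata: BTreeMap<String, String>,
}

impl Document {
    pub fn new(content: impl Into<String>) -> Self {
        Self {
            content: content.into(),
            metadata: BTreeMap::new(),
        }
    }

    /// Builder-style helper to attach a metadata key/value pair.
    pub fn with_metadata(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
        self.metadata.insert(key.into(), value.into());
        self
    }
}

/// Renders one document for the answering prompt.
/// A `path` header is emitted only when that metadata key is present.
pub fn render_document(doc: &Document) -> String {
    match doc.metadata.get("path") {
        Some(path) => format!("# {path}\n{}", doc.content),
        None => doc.content.clone(),
    }
}

/// Joins all rendered documents into a single context string.
pub fn render_context(docs: &[Document]) -> String {
    docs.iter()
        .map(render_document)
        .collect::<Vec<_>>()
        .join("\n---\n")
}
```

A chunk of code carrying a `path` entry would be rendered with its path as a header, while a document without metadata falls back to its bare content.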

@timonv timonv marked this pull request as draft December 27, 2024 10:46
@timonv timonv marked this pull request as ready for review December 29, 2024 15:00
@timonv timonv merged commit 235780b into master Dec 31, 2024
10 checks passed
@timonv timonv deleted the feat/query-docs-first-class branch December 31, 2024 11:39
timonv added a commit that referenced this pull request Jan 2, 2025
Reworks `PromptTemplate` into a more generic `Template`, so that templates
can also be used elsewhere. This deprecates `PromptTemplate`.

As an example, the `Simple` answer transformer has an optional
`Template`, which can be used to customize the output of retrieved
documents. This has excellent synergy with the metadata changes in #504.
timonv pushed a commit that referenced this pull request Jan 4, 2025
## 🤖 New release
* `swiftide`: 0.15.0 -> 0.16.0 (✓ API compatible changes)
* `swiftide-agents`: 0.15.0 -> 0.16.0 (✓ API compatible changes)
* `swiftide-core`: 0.15.0 -> 0.16.0 (⚠️ API breaking changes)
* `swiftide-macros`: 0.15.0 -> 0.16.0
* `swiftide-integrations`: 0.15.0 -> 0.16.0 (✓ API compatible changes)
* `swiftide-indexing`: 0.15.0 -> 0.16.0 (✓ API compatible changes)
* `swiftide-query`: 0.15.0 -> 0.16.0 (✓ API compatible changes)

### ⚠️ `swiftide-core` breaking changes

```
--- failure enum_missing: pub enum removed or renamed ---

Description:
A publicly-visible enum cannot be imported by its prior path. A `pub use` may have been removed, or the enum itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.38.0/src/lints/enum_missing.ron

Failed in:
  enum swiftide_core::querying::states::AnsweredBuilderError, previously in file /tmp/.tmpfXNg04/swiftide-core/src/query.rs:202
  enum swiftide_core::querying::states::RetrievedBuilderError, previously in file /tmp/.tmpfXNg04/swiftide-core/src/query.rs:179

--- failure struct_missing: pub struct removed or renamed ---

Description:
A publicly-visible struct cannot be imported by its prior path. A `pub use` may have been removed, or the struct itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.38.0/src/lints/struct_missing.ron

Failed in:
  struct swiftide_core::querying::states::RetrievedBuilder, previously in file /tmp/.tmpfXNg04/swiftide-core/src/query.rs:179
  struct swiftide_core::querying::states::AnsweredBuilder, previously in file /tmp/.tmpfXNg04/swiftide-core/src/query.rs:202

--- failure trait_added_supertrait: non-sealed trait added new supertraits ---

Description:
A non-sealed trait added one or more supertraits, which breaks downstream implementations of the trait
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#generic-bounds-tighten
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.38.0/src/lints/trait_added_supertrait.ron

Failed in:
  trait swiftide_core::querying::QueryState gained Default in file /tmp/.tmp5vx3MP/swiftide/swiftide-core/src/query.rs:178

--- failure trait_no_longer_object_safe: trait no longer object safe ---

Description:
Trait is no longer object safe, which breaks `dyn Trait` usage.
        ref: https://doc.rust-lang.org/stable/reference/items/traits.html#object-safety
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.38.0/src/lints/trait_no_longer_object_safe.ron

Failed in:
  trait QueryState in file /tmp/.tmp5vx3MP/swiftide/swiftide-core/src/query.rs:178
```
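
The last two failures appear to be related: `Default` requires `Self: Sized`, so a trait that takes it as a supertrait can no longer be used as `dyn Trait`. A minimal sketch of that effect, assuming hypothetical `OldState` and `NewState` traits and a `Pending` struct (these are illustrations, not swiftide's actual `QueryState`):

```rust
/// Hypothetical marker trait, shaped like a state trait before the change.
pub trait OldState: Send + Sync {}

/// The same trait after gaining `Default` as a supertrait.
pub trait NewState: Send + Sync + Default {}

/// Hypothetical state type used only for this illustration.
#[derive(Default, Debug, PartialEq)]
pub struct Pending {
    pub attempts: u32,
}

impl OldState for Pending {}
impl NewState for Pending {}

/// Trait objects work for the old shape.
pub fn boxed_old() -> Box<dyn OldState> {
    Box::new(Pending::default())
}

/// With `Default` as a supertrait, generic code can construct a state.
pub fn fresh<S: NewState>() -> S {
    S::default()
}

// The line below would not compile: `Default` requires `Self: Sized`,
// so `NewState` cannot be made into a trait object.
// pub fn boxed_new() -> Box<dyn NewState> { Box::new(Pending::default()) }
```

Downstream code that stored such a state behind `dyn` would need to switch to generics or a concrete type.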

<details><summary><i><b>Changelog</b></i></summary><p>

## `swiftide`
<blockquote>

## [0.16.0](v0.15.0...v0.16.0) - 2025-01-02

### New features

- [52e341e](52e341e) *(lancedb)* Public method for opening table (#514)

- [3254bd3](3254bd3) *(query)* Generic templates with document rendering (#520)

````text
Reworks `PromptTemplate` to a more generic `Template`, such that they
  can also be used elsewhere. This deprecates `PromptTemplate`.

  As an example, an optional `Template` in the `Simple` answer
  transformer, which can be used to customize the output of retrieved
  documents. This has excellent synergy with the metadata changes in #504.
````

- [235780b](235780b) *(query)* Documents as first class citizens (#504)

````text
For simple RAG, just adding the content of a retrieved document might be
  enough. However, in more complex use cases, you might want to add
  metadata as well, as is or for conditional formatting.

  For instance, when dealing with large amounts of chunked code, providing
  the path goes a long way. If generated metadata is good enough, could be
  useful as well.

  With this retrieved Documents are treated as first class citizens,
  including any metadata as well. Additionally, this also paves the way
  for multi retrieval (and multi modal).
````

- [584695e](584695e) *(query)* Add custom SQL query generation for pgvector search (#478)

````text
Adds support for custom retrieval queries with the sqlx query builder for PGVector. Puts down the fundamentals for custom query building for any retriever.
````

- [b55bf0b](b55bf0b) *(redb)* Public database and table definition (#510)

- [176378f](176378f) Implement traits for all Arc dynamic dispatch (#513)

````text
If you use e.g. a `Persist` or a `NodeCache` outside swiftide as well, and you already have it wrapped in an `Arc`, it now just works.
````
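
As a rough sketch of the pattern, assuming a simplified, hypothetical `Persist` trait (the real swiftide trait has a different, async signature) and an illustrative `run_pipeline` function:

```rust
use std::sync::Arc;

/// Simplified, hypothetical storage trait. This is NOT swiftide's `Persist`.
pub trait Persist: Send + Sync {
    fn store(&self, item: &str) -> usize;
}

/// Implementing the trait for the Arc'ed trait object lets callers pass
/// an `Arc<dyn Persist>` anywhere a `P: Persist` is expected.
impl Persist for Arc<dyn Persist> {
    fn store(&self, item: &str) -> usize {
        // Deref twice (`&Arc<dyn Persist>` -> `dyn Persist`) to delegate
        // to the inner implementation instead of recursing into this one.
        (**self).store(item)
    }
}

/// Hypothetical storage that just reports the length of what it stored.
pub struct CountingStore;

impl Persist for CountingStore {
    fn store(&self, item: &str) -> usize {
        item.len()
    }
}

/// Stand-in for a pipeline step that is generic over its storage.
pub fn run_pipeline<P: Persist>(storage: P, item: &str) -> usize {
    storage.store(item)
}
```

With this in place, a storage shared elsewhere in an application can be cloned cheaply and handed to the pipeline without an extra wrapper type.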

- [dc9881e](dc9881e) Allow opt out of pipeline debug truncation

### Bug fixes

- [2831101](2831101) *(lancedb)* Metadata should be nullable in lancedb (#515)

- [c35df55](c35df55) *(macros)* Explicit box dyn cast fixing Rust Analyzer troubles (#523)

### Miscellaneous

- [1bbbb0e](1bbbb0e) Clippy


**Full Changelog**: 0.15.0...0.16.0
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).