Rename "vault" to "collection"
dluc committed Jul 28, 2023
1 parent 60dc01d commit f2082ed
Showing 22 changed files with 47 additions and 47 deletions.
6 changes: 3 additions & 3 deletions COMMUNITY.md
@@ -9,7 +9,7 @@ see from the Semantic Memory. We do our best to respond to each submission.

## Public Semantic Kernel Community Office Hours

-We regularly have Community Office Hours that are open to the **public** to join.
+We regularly have Community Office Hours that are open to the **public** to join.

Add Semantic Kernel events to your calendar: download the
[calendar.ics](https://aka.ms/sk-community-calendar) file.
@@ -22,8 +22,8 @@ If you are unable to make it live, all meetings will be recorded and posted onli
## Join the conversation on Discord

We have a growing and active channel on Discord where you can get help, engage
-in lively discussion, and share what you've built with Semantic Memory and
+in lively discussion, and share what you've built with Semantic Memory and
Semantic Kernel!

Join our Discord:
-[https://aka.ms/SKDiscord](https://aka.ms/SKDiscord)
+[https://aka.ms/SKDiscord](https://aka.ms/SKDiscord)
2 changes: 1 addition & 1 deletion README.md
@@ -93,7 +93,7 @@ If you want to give the service a quick test, use the following commands.
cd server/combinedservices-dotnet

# First time configuration, creates appsettings.Development.json
-# You can skip this step if you have already configured the service.
+# You can skip this step if you have already configured the service.
dotnet run setup

# Run the service with settings from appsettings.Development.json
2 changes: 1 addition & 1 deletion SECURITY.md
@@ -29,7 +29,7 @@ message with our PGP key; please download it from the
You should receive a response within 24 hours. If for some reason you do not,
please follow up via email to ensure we received your original message.
Additional information can be found at
-[microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
+[microsoft.com/msrc](https://aka.ms/opensource/security/msrc).

Please include the requested information listed below (as much as you can
provide) to help us better understand the nature and scope of the possible issue:
4 changes: 2 additions & 2 deletions clients/curl/upload-file.sh
@@ -116,10 +116,10 @@ exitScript() {
readParameters "$@"
validatePrameters

-# Handle list of vault IDs
+# Handle list of collection IDs
COLLECTIONS_FIELD=""
for x in $COLLECTIONS; do
-COLLECTIONS_FIELD="${COLLECTIONS_FIELD} -F vaults=\"${x}\""
+COLLECTIONS_FIELD="${COLLECTIONS_FIELD} -F collections=\"${x}\""
done

# Send HTTP request using curl
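
Note on the hunk above: it renames the multipart form field from `vaults` to `collections` and repeats the field once per ID. A minimal sketch of that loop as a standalone function is shown here. The function name `build_collections_field` and the sample IDs are hypothetical and are not part of `upload-file.sh`; how the script passes the resulting string to curl is not visible in this hunk, so that step is left out.

```shell
# Hypothetical helper (not in upload-file.sh): builds the same string the
# loop above accumulates, one -F collections="<id>" field per argument.
build_collections_field() {
  field=""
  for id in "$@"; do
    # Each ID becomes its own repeated "collections" form field
    field="${field} -F collections=\"${id}\""
  done
  printf '%s\n' "$field"
}

# Sample IDs are assumptions chosen for illustration
COLLECTIONS_FIELD=$(build_collections_field work personal)
echo "$COLLECTIONS_FIELD"
```

Because the server binds the field by name, a client still sending `vaults` after this commit would presumably be rejected by the "Invalid or missing collection ID" check in `UploadRequest.cs`.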
6 changes: 3 additions & 3 deletions clients/dotnet/MemoryWebClient/MemoryWebClient.cs
@@ -53,11 +53,11 @@ private async Task ImportFilesInternalAsync(string[] files, ImportFileOptions op
List<IDisposable> disposables = new();
formData.Add(requestIdContent, "requestId");
formData.Add(userContent, "user");
-foreach (var vaultId1 in options.CollectionIds)
+foreach (var collectionId in options.CollectionIds)
{
-var content = new StringContent(vaultId1);
+var content = new StringContent(collectionId);
disposables.Add(content);
-formData.Add(content, "vaults");
+formData.Add(content, "collections");
}

for (int index = 0; index < files.Length; index++)
@@ -42,7 +42,7 @@ public static async Task RunAsync()
// Create sample pipeline with 4 files
Console.WriteLine("* Defining pipeline with 4 files...");
var pipeline = orchestrator
-.PrepareNewFileUploadPipeline("inProcessTest", "userId", new[] { "vault1" })
+.PrepareNewFileUploadPipeline("inProcessTest", "userId", new[] { "collection1" })
.AddUploadFile("file1", "file1.txt", "file1.txt")
.AddUploadFile("file2", "file2.txt", "file2.txt")
.AddUploadFile("file3", "file3.docx", "file3.docx")
4 changes: 2 additions & 2 deletions clients/samples/FileImportExamples/README.md
@@ -3,7 +3,7 @@
Before running the code, you will need some configuration step.

1. Copy `appsettings.json` to `appsettings.Development.json`
-(you could edit the original file, just be careful not sending the edited
+(you could edit the original file, just be careful not sending the edited
file to git/pull requests because it will contain personal settings and
potential secret credentials.)
2. Edit `appsettings.Development.json` and choose one embedding generator,
@@ -82,7 +82,7 @@ multiple files, with a fluent syntax:

```csharp
var pipeline = orchestrator
-.PrepareNewFileUploadPipeline("inProcessTest", "userId", new[] { "vault1" })
+.PrepareNewFileUploadPipeline("inProcessTest", "userId", new[] { "collection1" })
.AddUploadFile("file1", "file1.txt", "file1.txt")
.AddUploadFile("file2", "file2.txt", "file2.txt")
.AddUploadFile("file3", "file3.docx", "file3.docx")
2 changes: 1 addition & 1 deletion clients/samples/FileImportExamples/file1.txt
@@ -4,7 +4,7 @@ Carbon is the 15th most abundant element in the Earth's crust, and the fourth mo

The atoms of carbon can bond together in diverse ways, resulting in various allotropes of carbon. Well-known allotropes include graphite, diamond, amorphous carbon, and fullerenes. The physical properties of carbon vary widely with the allotropic form. For example, graphite is opaque and black, while diamond is highly transparent. Graphite is soft enough to form a streak on paper (hence its name, from the Greek verb "γράφειν" which means "to write"), while diamond is the hardest naturally occurring material known. Graphite is a good electrical conductor while diamond has a low electrical conductivity. Under normal conditions, diamond, carbon nanotubes, and graphene have the highest thermal conductivities of all known materials. All carbon allotropes are solids under normal conditions, with graphite being the most thermodynamically stable form at standard temperature and pressure. They are chemically resistant and require high temperature to react even with oxygen.

-The most common oxidation state of carbon in inorganic compounds is +4, while +2 is found in carbon monoxide and transition metal carbonyl complexes. The largest sources of inorganic carbon are limestones, dolomites and carbon dioxide, but significant quantities occur in organic deposits of coal, peat, oil, and methane clathrates. Carbon forms a vast number of compounds, with about two hundred million having been described and indexed;[19] and yet that number is but a fraction of the number of theoretically possible compounds under standard conditions.
+The most common oxidation state of carbon in inorganic compounds is +4, while +2 is found in carbon monoxide and transition metal carbonyl complexes. The largest sources of inorganic carbon are limestones, dolomites and carbon dioxide, but significant quantities occur in organic deposits of coal, peat, oil, and methane clathrates. Carbon forms a vast number of compounds, with about two hundred million having been described and indexed;[19] and yet that number is but a fraction of the number of theoretically possible compounds under standard conditions.

The allotropes of carbon include graphite, one of the softest known substances, and diamond, the hardest naturally occurring substance. It bonds readily with other small atoms, including other carbon atoms, and is capable of forming multiple stable covalent bonds with suitable multivalent atoms. Carbon is a component element in the large majority of all chemical compounds, with about two hundred million examples having been described in the published chemical literature.[19] Carbon also has the highest sublimation point of all elements. At atmospheric pressure it has no melting point, as its triple point is at 10.8 ± 0.2 megapascals (106.6 ± 2.0 atm; 1,566 ± 29 psi) and 4,600 ± 300 K (4,330 ± 300 °C; 7,820 ± 540 °F),[3][4] so it sublimes at about 3,900 K (3,630 °C; 6,560 °F).[21][22] Graphite is much more reactive than diamond at standard conditions, despite being more thermodynamically stable, as its delocalised pi system is much more vulnerable to attack. For example, graphite can be oxidised by hot concentrated nitric acid at standard conditions to mellitic acid, C6(CO2H)6, which preserves the hexagonal units of graphite while breaking up the larger structure.[23]

2 changes: 1 addition & 1 deletion lib/dotnet/Core.NetStandard20/ImportFileOptions.cs
@@ -53,7 +53,7 @@ public void Validate()

if (this.CollectionIds.Count < 1)
{
-throw new ArgumentNullException(nameof(this.CollectionIds), "The list of vaults is empty");
+throw new ArgumentNullException(nameof(this.CollectionIds), "The list of collections is empty");
}
}
}
6 changes: 3 additions & 3 deletions lib/dotnet/Core/Handlers/SaveEmbeddingsHandler.cs
@@ -60,7 +60,7 @@ public SaveEmbeddingsHandler(
/// <inheritdoc />
public async Task<(bool success, DataPipeline updatedPipeline)> InvokeAsync(DataPipeline pipeline, CancellationToken cancellationToken)
{
-// For each embedding file => For each Vector DB => Store vector (vaults ==> tags)
+// For each embedding file => For each Vector DB => Store vector (collections ==> tags)
foreach (var embeddingFile in pipeline.Files.SelectMany(x => x.GeneratedFiles.Where(f => f.Value.IsEmbeddingFile())))
{
foreach (object storageConfig in this._vectorDbs)
@@ -85,9 +85,9 @@ public SaveEmbeddingsHandler(
record.Tags.Add("file", embeddingFile.Value.ParentId);
record.Tags.Add("file_type", pipeline.GetFile(embeddingFile.Value.ParentId).Type);
record.Tags.Add("file_partition", embeddingFile.Value.Id);
-foreach (var vault in pipeline.VaultIds)
+foreach (var collectionId in pipeline.CollectionIds)
{
-record.Tags.Add("collection", vault);
+record.Tags.Add("collection", collectionId);
}

record.Metadata.Add("file_name", pipeline.GetFile(embeddingFile.Value.ParentId).Name);
8 changes: 4 additions & 4 deletions lib/dotnet/Core/MemoryStorage/MemoryRecord.cs
@@ -31,12 +31,12 @@ public class MemoryRecord

/// <summary>
/// Optional Searchable Key=Value tags (string => string[] collection)
-///
+///
/// Multiple values per keys are supported.
/// e.g. [ "Collection=Work", "Project=1", "Project=2", "Project=3", "Type=Chat", "LLM=AzureAda2" ]
-///
+///
/// Use cases:
-/// * collections, e.g. [ "Collection=Project1", "Collection=Work" ]
+/// * collections, e.g. [ "Collection=Project1", "Collection=Work" ]
/// * folders, e.g. [ "Folder=Inbox", "Folder=Spam" ]
/// * content types, e.g. [ "Type=Chat" ]
/// * versioning, e.g. [ "LLM=AzureAda2", "Schema=1.0" ]
@@ -46,7 +46,7 @@ public class MemoryRecord

/// <summary>
/// Optional Non-Searchable metadata processed client side.
-///
+///
/// Use cases:
/// * citations
/// * original text
8 changes: 4 additions & 4 deletions lib/dotnet/Core/Pipeline/BaseOrchestrator.cs
@@ -43,23 +43,23 @@ protected BaseOrchestrator(
public abstract Task RunPipelineAsync(DataPipeline pipeline, CancellationToken cancellationToken = default);

///<inheritdoc />
-public DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> vaultIds)
+public DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> collectionIds)
{
-return this.PrepareNewFileUploadPipeline(id, userId, vaultIds, new List<IFormFile>());
+return this.PrepareNewFileUploadPipeline(id, userId, collectionIds, new List<IFormFile>());
}

///<inheritdoc />
public DataPipeline PrepareNewFileUploadPipeline(
string id,
string userId,
-IEnumerable<string> vaultIds,
+IEnumerable<string> collectionIds,
IEnumerable<IFormFile> filesToUpload)
{
var pipeline = new DataPipeline
{
Id = id,
UserId = userId,
-VaultIds = vaultIds.ToList(),
+CollectionIds = collectionIds.ToList(),
Creation = DateTimeOffset.UtcNow,
LastUpdate = DateTimeOffset.UtcNow,
FilesToUpload = filesToUpload.ToList(),
6 changes: 3 additions & 3 deletions lib/dotnet/Core/Pipeline/DataPipeline.cs
@@ -100,7 +100,7 @@ public class FileDetails
public string Type { get; set; } = string.Empty;

/// <summary>
-/// List of files generated of the main file
+/// List of files generated of the main file
/// </summary>
[JsonPropertyOrder(4)]
[JsonPropertyName("generated_files")]
@@ -151,8 +151,8 @@ public string GetPartitionFileName(int partitionNumber)
public string UserId { get; set; } = string.Empty;

[JsonPropertyOrder(6)]
-[JsonPropertyName("vaults")]
-public List<string> VaultIds { get; set; } = new();
+[JsonPropertyName("collections")]
+public List<string> CollectionIds { get; set; } = new();

[JsonPropertyOrder(7)]
[JsonPropertyName("creation")]
8 changes: 4 additions & 4 deletions lib/dotnet/Core/Pipeline/IPipelineOrchestrator.cs
@@ -29,19 +29,19 @@ public interface IPipelineOrchestrator
/// </summary>
/// <param name="id">Id of the pipeline instance. This value will persist throughout the pipeline and final data lineage used for citations.</param>
/// <param name="userId">Primary user who the data belongs to. Other users, e.g. sharing, is not supported in the pipeline at this time.</param>
-/// <param name="vaultIds">List of vaults where o store the semantic memory extracted from the files. E.g. "chat ID", "personal", etc.</param>
+/// <param name="collectionIds">List of collections where to store the semantic memory extracted from the files. E.g. "chat ID", "personal", etc.</param>
/// <param name="filesToUpload">List of files provided before starting the pipeline, to be uploaded into the container before starting.</param>
/// <returns>Pipeline representation</returns>
-DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> vaultIds, IEnumerable<IFormFile> filesToUpload);
+DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> collectionIds, IEnumerable<IFormFile> filesToUpload);

/// <summary>
/// Create a new pipeline value object, with an empty list of files
/// </summary>
/// <param name="id">Id of the pipeline instance. This value will persist throughout the pipeline and final data lineage used for citations.</param>
/// <param name="userId">Primary user who the data belongs to. Other users, e.g. sharing, is not supported in the pipeline at this time.</param>
-/// <param name="vaultIds">List of vaults where o store the semantic memory extracted from the files. E.g. "chat ID", "personal", etc.</param>
+/// <param name="collectionIds">List of collections where to store the semantic memory extracted from the files. E.g. "chat ID", "personal", etc.</param>
/// <returns>Pipeline representation</returns>
-DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> vaultIds);
+DataPipeline PrepareNewFileUploadPipeline(string id, string userId, IEnumerable<string> collectionIds);

/// <summary>
/// Start a new data pipeline execution
14 changes: 7 additions & 7 deletions lib/dotnet/Core/WebService/UploadRequest.cs
@@ -14,7 +14,7 @@ public class UploadRequest
{
public string RequestId { get; set; } = string.Empty;
public string UserId { get; set; } = string.Empty;
-public IEnumerable<string> VaultIds { get; set; } = new List<string>();
+public IEnumerable<string> CollectionIds { get; set; } = new List<string>();
public IEnumerable<IFormFile> Files { get; set; } = new List<IFormFile>();

/* Resources:
@@ -26,7 +26,7 @@ public class UploadRequest
public static async Task<(UploadRequest model, bool isValid, string errMsg)> BindHttpRequestAsync(HttpRequest httpRequest)
{
const string UserField = "user";
-const string VaultsField = "vaults";
+const string CollectionsField = "collections";
const string RequestIdField = "requestId";

var result = new UploadRequest();
@@ -52,11 +52,11 @@ public class UploadRequest
return (result, false, $"Invalid or missing user ID, '{UserField}' value empty or not found, or multiple values provided");
}

-// At least one vault must be specified.Note: the pipeline might decide to ignore the specified vaults,
-// i.e. custom pipelines can override/ignore this value, depending on the implementation chosen.
-if (!form.TryGetValue(VaultsField, out StringValues vaultIds) || vaultIds.Count == 0 || vaultIds.Any(string.IsNullOrEmpty))
+// At least one collection must be specified. Note: the pipeline might decide to ignore the specified collections,
+// i.e. custom pipelines can override/ignore this value, depending on the implementation chosen.
+if (!form.TryGetValue(CollectionsField, out StringValues collectionIds) || collectionIds.Count == 0 || collectionIds.Any(string.IsNullOrEmpty))
{
-return (result, false, $"Invalid or missing vault ID, '{VaultsField}' list is empty or contains empty values");
+return (result, false, $"Invalid or missing collection ID, '{CollectionsField}' list is empty or contains empty values");
}

if (form.TryGetValue(RequestIdField, out StringValues requestIds) && requestIds.Count > 1)
@@ -68,7 +68,7 @@ public class UploadRequest
result.RequestId = requestIds.FirstOrDefault() ?? DateTimeOffset.Now.ToString("yyyyMMdd.HHmmss.", CultureInfo.InvariantCulture) + Guid.NewGuid().ToString("N");

result.UserId = userIds[0]!;
-result.VaultIds = vaultIds;
+result.CollectionIds = collectionIds;
result.Files = form.Files;

return (result, true, string.Empty);
2 changes: 1 addition & 1 deletion server/combinedservices-dotnet/Program.cs
@@ -92,7 +92,7 @@

// Define all the steps in the pipeline
var pipeline = orchestrator
-.PrepareNewFileUploadPipeline(containerId, input.UserId, input.VaultIds, input.Files)
+.PrepareNewFileUploadPipeline(containerId, input.UserId, input.CollectionIds, input.Files)
.Then("extract")
.Then("partition")
.Then("gen_embeddings")
2 changes: 1 addition & 1 deletion server/combinedservices-dotnet/appsettings.json
@@ -99,7 +99,7 @@
"Type": "AzureQueue",
// Used when Type == AzureQueue
"AzureQueue": {
-// - AzureIdentity: use automatic AAD authentication mechanism
+// - AzureIdentity: use automatic AAD authentication mechanism
// - ConnectionString: auth using a connection string
"Auth": "AzureIdentity",
// Azure Storage account name, required when using AzureIdentity auth
2 changes: 1 addition & 1 deletion server/pipelineservice-dotnet/README.md
@@ -42,7 +42,7 @@ The service depends on three main components:
Chroma and more.
* **Data ingestion orchestration**: this can run in memory and in the same
process, e.g. when working with small files, or run as a service, in which
-case it requires persistent queues like Azure Queues or RabbitMQ.
+case it requires persistent queues like Azure Queues or RabbitMQ.

**The pipeline service is designed to run in the background and in the cloud,
without direct interaction. We recommended using it with asynchronous queues
2 changes: 1 addition & 1 deletion server/pipelineservice-dotnet/appsettings.json
@@ -41,7 +41,7 @@
"Type": "AzureQueue",
// Used when Type == AzureQueue
"AzureQueue": {
-// - AzureIdentity: use automatic AAD authentication mechanism
+// - AzureIdentity: use automatic AAD authentication mechanism
// - ConnectionString: auth using a connection string
"Auth": "AzureIdentity",
// Azure Storage account name, required when using AzureIdentity auth
2 changes: 1 addition & 1 deletion server/samples/CustomHandlerExample/appsettings.json
@@ -98,7 +98,7 @@
"Type": "AzureQueue",
// Used when Type == AzureQueue
"AzureQueue": {
-// - AzureIdentity: use automatic AAD authentication mechanism
+// - AzureIdentity: use automatic AAD authentication mechanism
// - ConnectionString: auth using a connection string
"Auth": "AzureIdentity",
// Azure Storage account name, required when using AzureIdentity auth
2 changes: 1 addition & 1 deletion server/webservice-dotnet/Program.cs
@@ -85,7 +85,7 @@

// Define all the steps in the pipeline
var pipeline = orchestrator
-.PrepareNewFileUploadPipeline(containerId, input.UserId, input.VaultIds, input.Files)
+.PrepareNewFileUploadPipeline(containerId, input.UserId, input.CollectionIds, input.Files)
.Then("extract")
.Then("partition")
.Then("gen_embeddings")
2 changes: 1 addition & 1 deletion server/webservice-dotnet/appsettings.json
@@ -98,7 +98,7 @@
"Type": "AzureQueue",
// Used when Type == AzureQueue
"AzureQueue": {
-// - AzureIdentity: use automatic AAD authentication mechanism
+// - AzureIdentity: use automatic AAD authentication mechanism
// - ConnectionString: auth using a connection string
"Auth": "AzureIdentity",
// Azure Storage account name, required when using AzureIdentity auth