This code is a Python custom skill for Azure Cognitive Search, based on Azure Functions for Python. Using the Cosmos DB library, it inserts the input data as elements into a Cosmos DB collection.
- Follow this tutorial.
- Create an instance of the Content Moderator API in the Azure Portal. You will need to add its access key to the .py file from the next step.
- Use the Python code below as your __init__.py file. Customize it with your storage account details, your csv file name, and the target column. In my sample csv file, the target column is named Term, which reflects the idea that this code extracts predefined terms from the document content.
- Don't forget to add azure-functions (the package that provides the azure.functions module) to your requirements.txt file.
- Connect your published custom skill to your Cognitive Search enrichment pipeline. Please check the section below the code in this file. For more information, click here.
The Python code for this skill is here. Please take a minute to read all the comments within the code, where many details and constraints are explained.
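As a rough orientation before reading the full code, the request and response handling of a skill like this one can be sketched as follows. This is a minimal sketch and not the skill's actual implementation: `process_records` and `insert_item` are hypothetical names, and the shape of the stored item is an assumption. In a real function, `insert_item` would presumably wrap a Cosmos DB SDK call such as the container's `upsert_item`; here it is any callable, so the sketch runs without an Azure account.

```python
from typing import Any, Callable, Dict, List


def process_records(body: Dict[str, Any],
                    insert_item: Callable[[Dict[str, Any]], None]) -> Dict[str, Any]:
    """Write every element of each record through insert_item and report a status."""
    results: List[Dict[str, Any]] = []
    for record in body.get("values", []):
        record_id = record.get("recordId")
        try:
            elements = record["data"]["text"]
            if isinstance(elements, str):
                # A single string is treated as a one-element list.
                elements = [elements]
            for position, element in enumerate(elements):
                # Hypothetical item shape; the real skill may store other fields.
                insert_item({
                    "id": f"{record_id}-{position}",
                    "recordId": record_id,
                    "text": element,
                })
            results.append({"recordId": record_id, "data": {"text": "OK"}})
        except Exception as error:
            # Report the failure for this record only; other records still run.
            results.append({
                "recordId": record_id,
                "data": {},
                "errors": [{"message": str(error)}],
            })
    return {"values": results}


if __name__ == "__main__":
    stored: List[Dict[str, Any]] = []  # stands in for the Cosmos DB collection
    request_body = {"values": [{"recordId": "0", "data": {"text": ["FLAMENGO", ""]}}]}
    print(process_records(request_body, stored.append))
    print(len(stored), "items written")
```

Passing the writer in as a callable keeps the record handling testable on its own, separate from the database connection.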
Let's assume that organizations were extracted with the Entity Extraction Built-In skill.
{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "name": "cosmosdb-writer",
    "description": "write the data into a Cosmos DB Collection",
    "context": "/document",
    "uri": "your-Python-Azure-Functions-published-URL",
    "httpMethod": "POST",
    "timeout": "PT30S",
    "batchSize": 1,
    "degreeOfParallelism": null,
    "inputs": [
        {
            "name": "text",
            "source": "/document/organizations"
        }
    ],
    "outputs": [
        {
            "name": "text",
            "targetName": "insertResultStatus"
        }
    ],
    "httpHeaders": {}
}
Use the JSON input below to test your function and get familiar with how the code behaves in each situation.
The test is a tribute to the most popular football club in the world, Flamengo, from Rio de Janeiro. It was founded in 1895 and has over 45 million fans in Brazil alone. The team won its two most important competitions of 2019, the Brazilian championship and the Copa Libertadores of America.
{
    "values": [
        {
            "recordId": "0",
            "data": {
                "text": ["FLAMENGO", "VASCO", "FLAMENGO", "FLUMINENSE", "FLAMENGO"]
            }
        },
        {
            "recordId": "1",
            "data": {
                "text": [""]
            }
        },
        {
            "recordId": "2",
            "data": {
                "text": ["FLAMENGO", "Flamengo", "flamengo", "FLAMENGO"]
            }
        }
    ]
}
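One way to send the test input above to the function is a small script like the following. This is a sketch under assumptions: it targets a function running locally with Azure Functions Core Tools on the default port 7071, and the route name `cosmosdb-writer` is hypothetical. A published function would use its own URL and would typically also require its function key. Only the Python standard library is used.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: local run under Azure Functions Core Tools (default port 7071).
# "cosmosdb-writer" is a hypothetical route; replace it with your function name.
LOCAL_URL = "http://localhost:7071/api/cosmosdb-writer"


def build_request(url: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the custom-skill payload in a JSON POST request."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(url: str, payload: Dict[str, Any], timeout: float = 30.0) -> Dict[str, Any]:
    """Post the payload and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    test_payload = {
        "values": [
            {"recordId": "0", "data": {"text": ["FLAMENGO", "VASCO", "FLAMENGO", "FLUMINENSE", "FLAMENGO"]}},
            {"recordId": "1", "data": {"text": [""]}},
            {"recordId": "2", "data": {"text": ["FLAMENGO", "Flamengo", "flamengo", "FLAMENGO"]}},
        ]
    }
    print(json.dumps(send(LOCAL_URL, test_payload), indent=2))
```

The 30-second default timeout mirrors the `PT30S` value in the skill definition, so a call that is too slow for the indexer also fails in the local test.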
The expected output is shown below. The empty string in record 1 is inserted as a blank value.
{
    "values": [
        {
            "recordId": "0",
            "data": {
                "text": "OK"
            }
        },
        {
            "recordId": "1",
            "data": {
                "text": "OK"
            }
        },
        {
            "recordId": "2",
            "data": {
                "text": "OK"
            }
        }
    ]
}
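A response like the one above can be checked against the request that produced it. The helper below is a hedged sketch: `find_problems` is a hypothetical name, and it assumes that a successful insert is reported as `"OK"` in the `text` output, as in the sample output, and that failures appear in an `errors` list on the record.

```python
from typing import Any, Dict, List


def find_problems(request_body: Dict[str, Any],
                  response_body: Dict[str, Any]) -> List[str]:
    """Return one message per request record that did not come back as OK."""
    problems: List[str] = []
    returned = {r.get("recordId"): r for r in response_body.get("values", [])}
    for record in request_body.get("values", []):
        record_id = record.get("recordId")
        result = returned.get(record_id)
        if result is None:
            problems.append(f"record {record_id}: missing from response")
        elif result.get("errors"):
            problems.append(f"record {record_id}: reported errors")
        elif result.get("data", {}).get("text") != "OK":
            problems.append(f"record {record_id}: unexpected status")
    return problems


if __name__ == "__main__":
    sample_request = {"values": [{"recordId": "0", "data": {"text": ["FLAMENGO"]}},
                                 {"recordId": "1", "data": {"text": [""]}}]}
    sample_response = {"values": [{"recordId": "0", "data": {"text": "OK"}}]}
    for problem in find_problems(sample_request, sample_response):
        print(problem)
```

Matching on `recordId` rather than on position keeps the check valid even if the records come back in a different order.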