Skip to content

Commit

Permalink
Fix and document discovery queries behavior on distributed queries an…
Browse files Browse the repository at this point in the history
…d add tests (#7655)

This PR fixes and attempts to document the discovery queries behavior in distributed queries.

The main change:
If a distributed query does not have its corresponding discovery query, then it will always be executed.
(Currently on `master` this is not the case, if there's at least one discovery query then all those distributed queries that do not have one defined are not executed at all.)

This PR also fixes #5260 and #7793 and adds some tests to `Distributed.acceptWork`.
  • Loading branch information
lucasmrod authored Feb 17, 2023
1 parent 042cb99 commit cdadfd2
Show file tree
Hide file tree
Showing 5 changed files with 183 additions and 10 deletions.
66 changes: 66 additions & 0 deletions docs/wiki/deployment/remote.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,72 @@ As of osquery version 2.1.2, the distributed write API includes a top-level `sta
}
```

### Discovery queries on distributed queries

Distributed queries support "discovery queries", which are similar in semantics to [discovery queries in Packs](./configuration.md#discovery-queries).
A distributed discovery query controls whether or not a distributed query will be executed on a host.
- If a distributed query has no corresponding discovery query, then it is always executed on the host.
- If a discovery query returns one or more results, then its corresponding distributed query will be executed on the host.
- If a discovery query returns no results, then its corresponding distributed query will not be executed on the host.

Sample of a `distributed/read` response with discovery queries:
```json
{
"queries": {
"always_execute": "select version from osquery_info;",
"windows_info": "select name from system_info;",
"darwin_time": "select day from time;"
},
"discovery": {
"windows_info": "select * from os_version where platform='windows';",
"darwin_time": "select * from os_version where platform='darwin';"
}
}
```

When processing the above `distributed/read` response, osquery will:
- Always execute the `"always_execute"` query because there isn't a discovery query defined for it.
- Only execute `"windows_info"` query on Windows hosts.
- Only execute `"darwin_time"` on macOS hosts.

Here's the corresponding osquery `distributed/write` request on a Windows host:
```json
{
"node_key": "...",
"queries": {
"always_execute": [
{"version": "5.5.1"}
],
"windows_info": [
{"name": "windows"}
]
},
"statuses": {
"always_execute": 0,
"windows_info": 0
}
}
```

Here's the corresponding osquery `distributed/write` request on a macOS host:
```json
{
"node_key": "...",
"queries": {
"always_execute": [
{"version": "5.4.0"}
],
"darwin_time": [
{"day": "17"}
]
},
"statuses": {
"always_execute": 0,
"darwin_time": 0
}
}
```

## Customizations

There are several unlisted flags to further control the remote settings. These controls are helpful if using a somewhat opaque API.
Expand Down
2 changes: 1 addition & 1 deletion docs/wiki/development/unit-tests.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ The above code is very simple. If you're unfamiliar with the syntax/concepts of
Each component of osquery you're working on has its own "CMakeLists.txt" file. For example, the _tables_ component (folder) has its own "CMakeLists.txt" file at [osquery/tables/CMakeLists.txt](https://github.com/osquery/osquery/blob/master/osquery/tables/CMakeLists.txt). The file that we're going to be modifying today is [osquery/CMakeLists.txt](https://github.com/osquery/osquery/tree/master/osquery/CMakeLists.txt). Edit that file to include the following content:
```CMake
ADD_OSQUERY_TEST(example_test example_test.cpp)
add_osquery_executable(example_test example_test.cpp)
```

After you specify the test sources, add whatever libraries you have to link against and properly set the compiler flags, make sure you call `ADD_TEST` with your unit test. This registers it with CTest (CMake's test runner).
Expand Down
15 changes: 9 additions & 6 deletions osquery/distributed/distributed.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -291,8 +291,9 @@ Status Distributed::acceptWork(const std::string& work) {
return Status(1, "Error Parsing JSON");
}

std::set<std::string> queries_to_run;
// Check for and run discovery queries first
// Check for and run discovery queries first.
// Store their result in discovery_results.
std::map<std::string, bool> discovery_results;
if (doc.doc().HasMember("discovery")) {
const auto& queries = doc.doc()["discovery"];
assert(queries.IsObject());
Expand All @@ -313,9 +314,7 @@ Status Distributed::acceptWork(const std::string& work) {
if (!sql.getStatus().ok()) {
return Status(1, "Distributed discovery query has an SQL error");
}
if (sql.rows().size() > 0) {
queries_to_run.insert(name);
}
discovery_results.insert({name, (sql.rows().size() > 0)});
}
}
}
Expand All @@ -336,7 +335,11 @@ Status Distributed::acceptWork(const std::string& work) {
return Status(1, "Distributed query is not a string");
}

if (queries_to_run.empty() || queries_to_run.count(name)) {
// If a query does not have a corresponding discovery query
// or it does and it returned results, then store the query
// for execution.
const auto result = discovery_results.find(name);
if (result == discovery_results.cend() || result->second) {
setDatabaseValue(kDistributedQueries, name, query);
}
}
Expand Down
11 changes: 11 additions & 0 deletions osquery/distributed/distributed.h
Original file line number Diff line number Diff line change
Expand Up @@ -276,6 +276,14 @@ class Distributed {
* Given a response from a distributed plugin, parse the results and enqueue
* them in the internal state of the class
*
* Following is the behavior with respect to "discovery" queries in "work":
* - If a query in "queries" has no corresponding query in "discovery",
* then the distributed query is enqueued.
* - If a discovery query in "discovery" returns one or more results,
* then its corresponding distributed query in "queries" is enqueued.
* - If a discovery query in "discovery" returns no results, then its
* corresponding distributed query in "queries" is not enqueued.
*
* @param work is the string from DistributedPlugin::getQueries
* @return a Status indicating the success or failure of the operation
*/
Expand Down Expand Up @@ -364,5 +372,8 @@ class Distributed {
FRIEND_TEST(DistributedTests, test_workflow);
FRIEND_TEST(DistributedTests, test_run_queries_with_denylisted_query);
FRIEND_TEST(DistributedTests, test_check_and_set_as_running);
FRIEND_TEST(DistributedTests, test_accept_work_basic);
FRIEND_TEST(DistributedTests, test_accept_work_with_discovery);
FRIEND_TEST(DistributedTests, test_accept_work_with_discovery_all_fail);
};
} // namespace osquery
99 changes: 96 additions & 3 deletions osquery/distributed/tests/distributed_tests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -198,15 +198,15 @@ TEST_F(DistributedTests, test_workflow) {
auto dist = Distributed();
auto s = dist.pullUpdates();
ASSERT_TRUE(s.ok()) << s.getMessage();
EXPECT_EQ(s.toString(), "OK");
EXPECT_EQ(s.getMessage(), "OK");

auto queries = dist.getPendingQueries();

EXPECT_EQ(queries.size(), 2U);
EXPECT_EQ(dist.results_.size(), 0U);
s = dist.runQueries();
ASSERT_TRUE(s.ok());
EXPECT_EQ(s.toString(), "OK");
ASSERT_TRUE(s.ok()) << s.getMessage();
EXPECT_EQ(s.getMessage(), "OK");

queries = dist.getPendingQueries();
EXPECT_EQ(queries.size(), 0U);
Expand Down Expand Up @@ -342,4 +342,97 @@ TEST_F(DistributedTests, test_run_queries_with_denylisted_query) {
ASSERT_FALSE(status.ok()); // NotFound
ASSERT_TRUE(ts2.empty());
}

TEST_F(DistributedTests, test_accept_work_basic) {
auto dist = Distributed();

const std::string work = R"json(
{
"queries": {
"q1": "SELECT * FROM system_info;",
"q2": "SELECT * FROM osquery_info;"
}
}
)json";
auto s = dist.acceptWork(work);
ASSERT_TRUE(s.ok()) << s.getMessage();
const auto queries = dist.getPendingQueries();
ASSERT_EQ(queries.size(), 2);
for (const auto& queryName : queries) {
ASSERT_FALSE(queryName.compare("q1") && queryName.compare("q2"))
<< queryName;
std::string actualQuery;
s = getDatabaseValue(kDistributedQueries, queryName, actualQuery);
ASSERT_TRUE(s.ok()) << s.getMessage();
std::string expectedQuery;
if (queryName.compare("q1") == 0) {
expectedQuery = "SELECT * FROM system_info;";
} else { // q2
expectedQuery = "SELECT * FROM osquery_info;";
}
EXPECT_EQ(actualQuery, expectedQuery);
}
}

TEST_F(DistributedTests, test_accept_work_with_discovery) {
auto dist = Distributed();

const std::string work = R"json(
{
"queries": {
"q1": "SELECT * FROM system_info;",
"q2": "SELECT * FROM time;",
"q3": "SELECT * FROM osquery_info;"
},
"discovery": {
"q1": "SELECT 1;",
"q2": "SELECT 1 WHERE false;"
}
}
)json";
auto s = dist.acceptWork(work);
ASSERT_TRUE(s.ok()) << s.getMessage();

// Query q1 has a discovery query with > 0 results.
// Query q2 has a discovery query with 0 results, thus it won't be executed.
// Query q3 does not have a discovery query, thus it will be executed.
const auto queryNames = dist.getPendingQueries();
ASSERT_EQ(queryNames.size(), 2);
for (const auto& queryName : queryNames) {
ASSERT_TRUE(queryName == "q1" || queryName == "q3");
std::string actualQuery;
s = getDatabaseValue(kDistributedQueries, queryName, actualQuery);
ASSERT_TRUE(s.ok()) << s.getMessage();
std::string expectedQuery = "SELECT * FROM system_info;";
if (queryName == "q3") {
expectedQuery = "SELECT * FROM osquery_info;";
}
EXPECT_EQ(actualQuery, expectedQuery);
}
}

// Tests https://github.com/osquery/osquery/issues/5260.
TEST_F(DistributedTests, test_accept_work_with_discovery_all_fail) {
auto dist = Distributed();

const std::string work = R"json(
{
"queries": {
"q1": "SELECT * FROM system_info;",
"q2": "SELECT * FROM osquery_info;"
},
"discovery": {
"q1": "SELECT 1 WHERE false;",
"q2": "SELECT 1 WHERE false;"
}
}
)json";
auto s = dist.acceptWork(work);
ASSERT_TRUE(s.ok()) << s.getMessage();

// Both queries q1 and q2 have a discovery query with 0 results,
// thus they won't be executed.
const auto queries = dist.getPendingQueries();
ASSERT_EQ(queries.size(), 0);
}
} // namespace osquery

0 comments on commit cdadfd2

Please sign in to comment.