Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GFD validation #154

Merged
merged 6 commits into from
Jan 22, 2024
Merged

GFD validation #154

merged 6 commits into from
Jan 22, 2024

Conversation

AntonChern
Copy link
Contributor

Algorithm for checking if a graph functional dependency satisfies a given graph.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


std::vector<Eigen::VectorXi> current = {};
for (int j = 0; j < available_vertices.at(0).size(); ++j) {
Eigen::VectorXi temp = Eigen::VectorXi::Zero(Q.rows());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

rrent = {};
                                   ^

current.push_back(temp);
}
for (int i = 1; i < available_vertices.size(); ++i) {
std::vector<Eigen::VectorXi> new_vecs = {};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

ush_back(temp);
                                              ^

for (const Eigen::VectorXi& vec : current) {
for (int j = 0; j < available_vertices.at(i).size(); ++j) {
bool contains = false;
int index = available_vertices.at(i).at(j);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

; j < available_vertices.at(i).size(); ++j) {
    ^

(result.begin() + i)->push_back(weight);
} else {
sort(result.begin(), result.end(), [](std::vector<int> a, std::vector<int> b) {
return std::accumulate(a.begin(), a.end(), 0, std::plus<int>()) <
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: Value stored to 'cur' during its initialization is never read [clang-analyzer-deadcode.DeadStores]

step
               ^

src/algorithms/gfd/gfd_validation.cpp:310: Value stored to 'cur' during its initialization is never read

step
               ^

(result.begin() + i)->push_back(weight);
} else {
sort(result.begin(), result.end(), [](std::vector<int> a, std::vector<int> b) {
return std::accumulate(a.begin(), a.end(), 0, std::plus<int>()) <
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: unused variable 'cur' [clang-diagnostic-unused-variable]

step
               ^

} else {
sort(result.begin(), result.end(), [](std::vector<int> a, std::vector<int> b) {
return std::accumulate(a.begin(), a.end(), 0, std::plus<int>()) <
std::accumulate(b.begin(), b.end(), 0, std::plus<int>());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

eted_large) {
                             ^

this->result = this->getSatisfiedGFDs(this->graph, this->gfds, std::thread::hardware_concurrency());

auto elapsed_milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now() - start_time);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

t(gfd_index);
                                                       ^

std::vector<GFD> getSatisfiedGFDs(const Graph&, const std::vector<GFD>&, const int&);
public:
GFDValidation() = default;
GFDValidation(Graph graph_, std::vector<GFD> gfds_) : graph(graph_), gfds(gfds_) {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: explicitly defaulted default constructor is implicitly deleted [clang-diagnostic-defaulted-function-deleted]

GFDValidation() = default;
^

src/algorithms/gfd/gfd_validation.h:11: default constructor of 'GFDValidation' is implicitly deleted because field 'graph' has no default constructor

ph graph;
   ^

this->indices.insert(index);
max = max < index ? index : max;
}
if (max != this->indices.size() - 1) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

x;
                       ^

@AntonChern AntonChern force-pushed the gfd branch 2 times, most recently from b194a89 to 45ec1c9 Compare December 5, 2022 00:37
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


std::vector<Eigen::VectorXi> current = {};
for (int j = 0; j < available_vertices.at(0).size(); ++j) {
Eigen::VectorXi temp = Eigen::VectorXi::Zero(Q.rows());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

> current = {};
                                     ^

}
for (int i = 1; i < available_vertices.size(); ++i) {
std::vector<Eigen::VectorXi> new_vecs = {};
for (const Eigen::VectorXi &vec : current) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

< available_vertices.size(); ++i) {
^

for (const Eigen::VectorXi &vec : current) {
for (int j = 0; j < available_vertices.at(i).size(); ++j) {
bool contains = false;
int index = available_vertices.at(i).at(j);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

(int j = 0; j < available_vertices.at(i).size(); ++j) {
              ^

sort(result.begin(), result.end(),
[](std::vector<int> a, std::vector<int> b) {
return std::accumulate(a.begin(), a.end(), 0, std::plus<int>()) <
std::accumulate(b.begin(), b.end(), 0, std::plus<int>());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

r (const int &weight : deleted_large) {
                                                   ^

auto elapsed_milliseconds =
std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now() - start_time);
return elapsed_milliseconds.count();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

i < gfds.size(); ++i) {
  ^


public:
GFDValidation() = default;
GFDValidation(Graph graph_, std::vector<GFD> gfds_)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: explicitly defaulted default constructor is implicitly deleted [clang-diagnostic-defaulted-function-deleted]

ublic:
          ^

src/algorithms/gfd/gfd_validation.h:11: default constructor of 'GFDValidation' is implicitly deleted because field 'graph' has no default constructor

ph graph;
   ^

this->indices.insert(index);
max = max < index ? index : max;
}
if (max != this->indices.size() - 1) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

: max;
                       ^

@AntonChern AntonChern force-pushed the gfd branch 2 times, most recently from 933c048 to 8e46320 Compare December 13, 2022 11:10
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


namespace algos {

void GFDValidation::FitInternal(model::IDatasetStream& data_stream) {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: unused parameter 'data_stream' [clang-diagnostic-unused-parameter]

lgos {
                                                                    ^


std::vector<Eigen::VectorXi> current = {};
for (int j = 0; j < available_vertices.at(0).size(); ++j) {
Eigen::VectorXi temp = Eigen::VectorXi::Zero(Q.rows());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

Xi> current = {};
                                       ^

}
for (int i = 1; i < available_vertices.size(); ++i) {
std::vector<Eigen::VectorXi> new_vecs = {};
for (const Eigen::VectorXi &vec : current) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

i < available_vertices.size(); ++i) {
  ^

for (const Eigen::VectorXi &vec : current) {
for (int j = 0; j < available_vertices.at(i).size(); ++j) {
bool contains = false;
int index = available_vertices.at(i).at(j);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

r (int j = 0; j < available_vertices.at(i).size(); ++j) {
                ^

return std::accumulate(a.begin(), a.end(), 0, std::plus<int>()) <
std::accumulate(b.begin(), b.end(), 0, std::plus<int>());
});
result.begin()->push_back(weight);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

f (i < m - large_processors.size()) {
     ^

auto elapsed_milliseconds =
std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now() - start_time);
std::cout << "Satisfied GFDs: " << result.size()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

gfd_index);
                                             ^

size_t vertices_num = stoi(sizes.at(0));
size_t edges_num = stoi(sizes.at(1));
for (int i = 0; i < vertices_num; ++i) {
std::getline(stream, line);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [clang-diagnostic-sign-compare]

at(1));
                             ^

}
for (int i = 0; i < edges_num; ++i) {
std::getline(stream, line);
auto indices = Parser::split(line.substr(0, line.find("[")), "->");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [clang-diagnostic-sign-compare]

= 0; i < edges_num; ++i) {
       ^

size_t edges_num = stoi(sizes.at(1));
for (int i = 0; i < vertices_num; ++i) {
std::map<std::string, std::string> attrs;
std::getline(stream, line);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [clang-diagnostic-sign-compare]

or (int i = 0; i < vertices_num; ++i) {
                 ^

}
for (int i = 0; i < edges_num; ++i) {
std::getline(stream, line);
auto indices = Parser::split(line.substr(0, line.find("[")), "->");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [clang-diagnostic-sign-compare]

es.emplace(index, attrs);
                                                    ^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

(result.begin() + i)->push_back(weight);
}
else {
sort(result.begin(), result.end(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

ge) {
                     ^


namespace algos {

void GFDValidation::FitInternal(model::IDatasetStream& data_stream) {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: unused parameter 'data_stream' [clang-diagnostic-unused-parameter]

e algos {
                                                                    ^

}

template <typename graph_type>
graph_type GFDValidation::getSubgraph(const graph_type& graph,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: non-void function does not return a value in all control paths [clang-diagnostic-return-type]

ttern[*i].node_id;
                                      ^

src/algorithms/gfd/gfd_validation.cpp:280: in instantiation of function template specialization 'algos::GFDValidation::getCenter<boost::adjacency_list<boost::vecS, boost::vecS, boost::bidirectionalS, vertex, edge>>' requested here

s = 0;
                                  ^

auto in = boost::in_degree(*i, graph);
if (out >= degrees.first && in >= degrees.second &&
graph[*i].attributes.at("label") == label) {
result.push_back(graph[*i].node_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'typename config::degree_size_type' (aka 'unsigned long') and 'const int' [clang-diagnostic-sign-compare]

raph);
                        ^

auto in = boost::in_degree(*i, graph);
if (out >= degrees.first && in >= degrees.second &&
graph[*i].attributes.at("label") == label) {
result.push_back(graph[*i].node_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'typename config::degree_size_type' (aka 'unsigned long') and 'const int' [clang-diagnostic-sign-compare]

raph);
                                               ^

src/algorithms/gfd/gfd_validation.cpp:285: in instantiation of function template specialization 'algos::GFDValidation::getCandidateVertices<boost::adjacency_list<boost::vecS, boost::vecS, boost::bidirectionalS, vertex, edge>>' requested here

or<int> candidate_vertices = getCandidateVertices(
                             ^

unsigned long long GFDValidation::ExecuteInternal() {
auto start_time = std::chrono::system_clock::now();

this->result = this->generateSatisfiedGFDs(this->graph, this->gfds,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

fied.insert(gfd_index);
                                                                 ^

@polyntsov polyntsov marked this pull request as draft February 11, 2023 12:26
@Mstrutov
Copy link
Collaborator

Mstrutov commented May 2, 2023

весь ПР ещё не смотрел + возможно это WIP, но большие датасеты (>100k строк) надо убирать из input_data. например, это увеличивает объём репо. вместо этого нужно использовать отдельный репозиторий https://github.com/Mstrutov/Desbordante-Data, откуда ./pull_datasets.sh берёт датасеты

@polyntsov
Copy link
Collaborator

весь ПР ещё не смотрел + возможно это WIP, но большие датасеты (>100k строк) надо убирать из input_data. например, это увеличивает объём репо. вместо этого нужно использовать отдельный репозиторий https://github.com/Mstrutov/Desbordante-Data, откуда ./pull_datasets.sh берёт датасеты

И видимо класть их git lfs всё ещё

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 25 out of 26. Check the log or trigger a new build to see more.

(result.begin() + i)->push_back(weight);
}
else {
sort(result.begin(), result.end(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

ge) {
                     ^

}
}
return result;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'typename config::degree_size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

< boost::degree(*adjacency_it, graph)) {
^

return true;
}

int EGFDValidation::getRoot(const graph_t& graph, const graph_t& query, const std::set<vertex_t>& core) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'std::map<std::basic_string, unsigned long>::mapped_type' (aka 'unsigned long') and 'const int' [clang-diagnostic-sign-compare]

grees.find(label) == graph_label_degrees.end() ||
                                                                                             ^


int b_degree = boost::degree(b, query);
int bn = 0;
for (const vertex_t& e : label_classes.at(query[b].attributes.at("label"))) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'typename config::degree_size_type' (aka 'unsigned long') and 'int' [clang-diagnostic-sign-compare]

utes.at("label"))) {
                                                              ^

return an / a_degree < bn / b_degree;
};
std::sort(order.begin(), order.end(), cmp1);
int top = std::min(int(order.size()), 3);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'typename config::degree_size_type' (aka 'unsigned long') and 'int' [clang-diagnostic-sign-compare]

         if (boost::degree(e, graph) >= b_degree) {
                                     ^

std::set<vertex_t> to_delete = {};
for (auto& link : parent) {
if (indices.find(link.first) == indices.end()) {
to_delete.insert(link.first);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'unsigned long' [clang-diagnostic-sign-compare]

(j == core.size() - 1) {
   ^

for (boost::tie(i, end) = vertices(pattern); i != end; ++i) {
bool is_center = true;
typename boost::graph_traits<graph_t>::vertex_iterator j;
for (j = vertices(pattern).first; j != end; ++j) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: variable 'result' is used uninitialized whenever 'for' loop exits because its condition is false [clang-diagnostic-sometimes-uninitialized]

rn); i != end; ++i) {
     ^
Additional context

src/core/algorithms/gfd/gfd_validation.cpp:128: uninitialized use occurs here

k;
                                                                                                                  ^

src/core/algorithms/gfd/gfd_validation.cpp:114: remove the condition if it is always true

rn); i != end; ++i) {
     ^

src/core/algorithms/gfd/gfd_validation.cpp:111: initialize the variable 'result' to silence this warning

, em);
                                          ^


// print
//for (int i = 0; i < processors_num; ++i) {
// for (auto& e :requests.at(i)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

    for (int i = 0; i < partition.size(); ++i) {
                      ^

balanced_messages.push_back(messages);
}

std::vector<GFD> result = {};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::vector::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

  for (int i = 0; i < balanced_weights.size(); ++i) {
                    ^

this->result = this->generateSatisfiedGFDs(this->graph, this->gfds);

auto elapsed_milliseconds =
std::chrono::duration_cast<std::chrono::milliseconds>(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::map<int, GFD>::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

satisfied.insert(ind);
                                                          ^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

for (std::size_t i = 0; i < m; ++i) {
// the first value is index
std::vector<int> temp = { i };
result.push_back(temp);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: non-constant-expression cannot be narrowed from type 'std::size_t' (aka 'unsigned long') to 'int' in initializer list [clang-diagnostic-c++11-narrowing]

<int> temp = { i };
               ^
Additional context

src/core/algorithms/gfd/balancer.cpp:21: insert an explicit cast to silence this issue

<int> temp = { i };
               ^

return seq;
}

bool EGFDValidation::ValidateNT(const graph_t& graph, const vertex_t& v, const graph_t& query, const vertex_t& u,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: Division by zero [clang-analyzer-core.DivideZero]

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                 ^
Additional context

src/core/algorithms/gfd/egfd_validation.cpp:640: Calling 'EGFDValidation::matching_order'

s.push_back(nt);
                                                         ^

src/core/algorithms/gfd/egfd_validation.cpp:582: Assuming the condition is true

d(), std::back_inserter(seq));
                                           ^

src/core/algorithms/gfd/egfd_validation.cpp:582: Loop condition is true. Entering loop body

d(), std::back_inserter(seq));
                                    ^

src/core/algorithms/gfd/egfd_validation.cpp:594: Calling 'min_element<__gnu_cxx::__normal_iterator<std::vector *, std::vector<std::vector>>, (lambda at /github/workspace/src/core/algorithms/gfd/egfd_validation.cpp:577:20)>'

_ = *std::min_element(paths.begin(), paths.end(), cmp);
     ^

/usr/include/c++/11/bits/stl_algo.h:5674: Calling '__min_element<__gnu_cxx::__normal_iterator<std::vector *, std::vector<std::vector>>, __gnu_cxx::__ops::_Iter_comp_iter<(lambda at /github/workspace/src/core/algorithms/gfd/egfd_validation.cpp:577:20)>>'

      return _GLIBCXX_STD_A::__min_element(__first, __last,
             ^

/usr/include/x86_64-linux-gnu/c++/11/bits/c++config.h:419: expanded from macro '_GLIBCXX_STD_A'

# define _GLIBCXX_STD_A std
                        ^

/usr/include/c++/11/bits/stl_algo.h:5619: Taking false branch

      if (__first == __last)
      ^

/usr/include/c++/11/bits/stl_algo.h:5622: Loop condition is true. Entering loop body

      while (++__first != __last)
      ^

/usr/include/c++/11/bits/stl_algo.h:5623: Calling '_Iter_comp_iter::operator()'

	if (__comp(__first, __result))
     ^

/usr/include/c++/11/bits/predefined_ops.h:157: Calling 'operator()'

        { return bool(_M_comp(*__it1, *__it2)); }
                      ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Calling 'EGFDValidation::candidates_cardinality'

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                   ^

src/core/algorithms/gfd/egfd_validation.cpp:549: Returning zero

  return edge_cans.second.size();
                                                         ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Returning from 'EGFDValidation::candidates_cardinality'

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                   ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Division by zero

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                 ^

return seq;
}

bool EGFDValidation::ValidateNT(const graph_t& graph, const vertex_t& v, const graph_t& query, const vertex_t& u,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: Division by zero [clang-analyzer-core.DivideZero]

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                                                                                                              ^
Additional context

src/core/algorithms/gfd/egfd_validation.cpp:640: Calling 'EGFDValidation::matching_order'

s.push_back(nt);
                                                         ^

src/core/algorithms/gfd/egfd_validation.cpp:582: Assuming the condition is true

d(), std::back_inserter(seq));
                                           ^

src/core/algorithms/gfd/egfd_validation.cpp:582: Loop condition is true. Entering loop body

d(), std::back_inserter(seq));
                                    ^

src/core/algorithms/gfd/egfd_validation.cpp:594: Calling 'min_element<__gnu_cxx::__normal_iterator<std::vector *, std::vector<std::vector>>, (lambda at /github/workspace/src/core/algorithms/gfd/egfd_validation.cpp:577:20)>'

_ = *std::min_element(paths.begin(), paths.end(), cmp);
     ^

/usr/include/c++/11/bits/stl_algo.h:5674: Calling '__min_element<__gnu_cxx::__normal_iterator<std::vector *, std::vector<std::vector>>, __gnu_cxx::__ops::_Iter_comp_iter<(lambda at /github/workspace/src/core/algorithms/gfd/egfd_validation.cpp:577:20)>>'

      return _GLIBCXX_STD_A::__min_element(__first, __last,
             ^

/usr/include/x86_64-linux-gnu/c++/11/bits/c++config.h:419: expanded from macro '_GLIBCXX_STD_A'

# define _GLIBCXX_STD_A std
                        ^

/usr/include/c++/11/bits/stl_algo.h:5619: Taking false branch

      if (__first == __last)
      ^

/usr/include/c++/11/bits/stl_algo.h:5622: Loop condition is true. Entering loop body

      while (++__first != __last)
      ^

/usr/include/c++/11/bits/stl_algo.h:5623: Calling '_Iter_comp_iter::operator()'

	if (__comp(__first, __result))
     ^

/usr/include/c++/11/bits/predefined_ops.h:157: Calling 'operator()'

        { return bool(_M_comp(*__it1, *__it2)); }
                      ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Calling 'EGFDValidation::candidates_cardinality'

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                                                                                                                ^

src/core/algorithms/gfd/egfd_validation.cpp:549: Returning zero

  return edge_cans.second.size();
                                                         ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Returning from 'EGFDValidation::candidates_cardinality'

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                                                                                                                ^

src/core/algorithms/gfd/egfd_validation.cpp:592: Division by zero

pi, a, a_origin) / candidates_cardinality(cpi, a_origin) <
                                                                                                              ^

}

while ((match.at(i).first != match.at(i).second) && (visited(*match.at(i).first, i) ||
!ValidateNT(graph, *match.at(i).first, query, seq.at(i), seq, parent, match))) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'std::size_t' (aka 'unsigned long') and 'int' [clang-diagnostic-sign-compare]

        std::size_t i = core.size() - 1;
                                                                                  ^

std::pair<vertex_t, vertex_t> edge(parent.at(seq.at(k)), seq.at(k));
std::size_t index = std::find(seq.begin(), seq.end(), parent.at(seq.at(k))) - seq.begin();
std::pair<std::set<vertex_t>::iterator, std::set<vertex_t>::iterator>
its(cpi.at(edge).at(*match.at(index).first).begin(), cpi.at(edge).at(*match.at(index).first).end());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'std::size_t' (aka 'unsigned long') and 'int' [clang-diagnostic-sign-compare]

          i++;
                                         ^


namespace algos {

void NaiveGFDValidation::FitInternal(model::IDatasetStream& data_stream) {}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: unused parameter 'data_stream' [clang-diagnostic-unused-parameter]

os {
                                                                    ^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

}

while ((match.at(i).first != match.at(i).second) && (visited(*match.at(i).first, i) ||
!ValidateNT(graph, *match.at(i).first, query, seq.at(i), seq, parent, match))) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: comparison of integers of different signs: 'int' and 'std::set::size_type' (aka 'unsigned long') [clang-diagnostic-sign-compare]

t i = static_cast<int>(core.size()) - 1;
                                                            ^

@AntonChern AntonChern marked this pull request as ready for review October 16, 2023 17:42
src/core/algorithms/algorithm_types.h Show resolved Hide resolved
.github/workflows/cmake.yml Outdated Show resolved Hide resolved
datasets/datasets.zip Outdated Show resolved Hide resolved
src/core/config/names.h Outdated Show resolved Hide resolved
src/core/config/descriptions.h Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/balancer.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/graph_descriptor.h Outdated Show resolved Hide resolved
@Mstrutov
Copy link
Collaborator

Also, you will have to rework the commits themselves. E.g., group "refactor" commits, remove commits on eigenvalues, squash development steps, etc. Finally, write descriptive commit messages. You can do this before the review-based refactor or after --- which would be more convenient for you.

This is necessary to keep the commit history readable and informative. On the contrary, it is a bad practice to just squash everything into a single 2k-lines-of-code commit, as it would be hard to review and make sense of the reasons which lead to a particular change.

Personally, I have the following criteria for the perfect commit list:

  • All the code made it into the PR final version, i.e. there are no intermediate commits. E.g., in your case, there are several commits on eigenvalues, which were removed at some point, therefore there were no eigenvalues before the PR, and there will be no eigenvalues after the PR. Say, the PR as a whole changes the code state from A to D. We are usually interested only in the final result, the whole A->[B->C->B->B'->C']->D chain can and should be omitted.
  • Each commit targets one conceptual change, e.g. "Add GFD algorithm", "Add efficient GFD algortihm", "Add tests".
  • Each commit has a descriptive message.

This is a pretty subjective topic, but I hope my comment will point you in a right direction.
See examples of what I view as a good commit list: https://github.com/Mstrutov/Desbordante/pull/215/commits, https://github.com/Mstrutov/Desbordante/pull/139/commits

@AntonChern AntonChern force-pushed the gfd branch 2 times, most recently from 5c0f5b8 to 4a99392 Compare November 20, 2023 17:41
@AntonChern AntonChern force-pushed the gfd branch 4 times, most recently from dfc4049 to 653a298 Compare December 4, 2023 15:23
if (i < m - numOfLargeProcs) {
(result.begin() + i)->push_back(weight);
} else {
sort(result.begin(), result.end(), [](std::vector<int> a, std::vector<int> b) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use std::vector<int> const & to avoid making unnecessary copies

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto cGreater at 114 still has this problem

Copy link
Collaborator

@Mstrutov Mstrutov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

There is one pending comment from my previous review, @vs9h comments, and there is some readability refactoring in egfd_validation to be done (for inspiration, see HyFD and HyUCC along with their common components in hycommon). Other than that, it seems, I have nothing to add.

src/core/algorithms/gfd/egfd_validation.cpp Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/egfd_validation.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.h Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.h Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.h Outdated Show resolved Hide resolved
src/core/algorithms/gfd/parser.h Outdated Show resolved Hide resolved
src/core/algorithms/gfd/gfd_handler.cpp Outdated Show resolved Hide resolved
src/core/algorithms/gfd/gfd_handler.h Outdated Show resolved Hide resolved
@AntonChern AntonChern force-pushed the gfd branch 3 times, most recently from 63b3453 to c1d4ed1 Compare January 8, 2024 17:38
Comment on lines 12 to 14
std::vector<std::string> Split(std::string str, std::string sep);
std::vector<Literal> ParseLiterals(std::istream& stream);
void WriteLiterals(std::ostream& stream, std::vector<Literal> const& literals);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions are needed only for implementation, so it is better not to declare them in the header.
Don't forget to wrap these function to anonymous namespace in the source file.

Comment on lines 25 to 26
Balancer() = default;
~Balancer() = default;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed

Comment on lines 333 to 335
Balancer* balancer = new Balancer();
std::vector<std::vector<int>> balanced_weights = balancer->Balance(weights, threads_num_);
delete balancer;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Balancer* balancer = new Balancer();
std::vector<std::vector<int>> balanced_weights = balancer->Balance(weights, threads_num_);
delete balancer;
Balancer balancer;
std::vector<std::vector<int>> balanced_weights = balancer.Balance(weights, threads_num_);

Added classes for working with graphs and GFDs.
Graphs are presented as boost graphs.
The parser provides methods for writing and reading graphs and GFDs from
a file. The graph is represented in text format in the DOT language.
To represent GFD, lists of literals are indicated before the
graph-pattern (premises are the first line, conclusion is the second)
in the format index.feature=name (index1.feature1=index2.feature2)
separated by a space.
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -55,9 +56,25 @@ BETTER_ENUM(AlgorithmType, char,

/* UCC verifier algorithm */
ucc_verifier,
<<<<<<< HEAD

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: version control conflict marker in file [clang-diagnostic-error]

<<<<<<< HEAD
^

<<<<<<< HEAD
gfdvalid
>>>>>>> Implement baseline GFD validation algorithm
=======

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected '= constant-expression' or end of enumerator definition [clang-diagnostic-error]

=======
^

<<<<<<< HEAD
gfdvalid
>>>>>>> Implement baseline GFD validation algorithm
=======

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

=======
  ^

<<<<<<< HEAD
gfdvalid
>>>>>>> Implement baseline GFD validation algorithm
=======

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

=======
    ^

<<<<<<< HEAD
gfdvalid
>>>>>>> Implement baseline GFD validation algorithm
=======

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

=======
      ^

gfdvalid
>>>>>>> Implement baseline GFD validation algorithm
=======
gfdvalid,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use of undeclared identifier 'gfdvalid'; did you mean 'egfdvalid'? [clang-diagnostic-error]

Suggested change
gfdvalid,
egfdvalid,

(skipping 42 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)

>>>>>>> Implement baseline GFD validation algorithm
=======
gfdvalid,
<<<<<<< HEAD

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: version control conflict marker in file [clang-diagnostic-error]

<<<<<<< HEAD
^

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -6,10 +6,25 @@

namespace algos {

<<<<<<< HEAD

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: version control conflict marker in file [clang-diagnostic-error]

<<<<<<< HEAD
^

fd_verifier::FDVerifier, HyUCC, PyroUCC, cfd::FDFirstAlgorithm,
ACAlgorithm, UCCVerifier, Faida, GfdValidation, EGfdValidation>;
>>>>>>> 8880e8c... Implement efficient GFD validation algorithm
=======

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected unqualified-id [clang-diagnostic-error]

=======
^

metric::MetricVerifier, DataStats, fd_verifier::FDVerifier, HyUCC, PyroUCC,
cfd::FDFirstAlgorithm, ACAlgorithm, UCCVerifier, Faida, GfdValidation,
EGfdValidation, NaiveGfdValidation>;
>>>>>>> 6a7de71... Implement naive GFD validation algorithm

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected unqualified-id [clang-diagnostic-error]

>>>>>>> 6a7de71... Implement naive GFD validation algorithm
^

Algorithm for checking if a graph functional dependency satisfies a
given graph. Algorithm can be run from the console.
The algorithm checking the satisfiability of GFD.
Uses CPI algorithm as a subgraph search.
GFD validation algorithm that uses VF2 algorithm to find a subgraph.
Input data is presented as dot-files.
Undirected graphs are used.
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected identifier [clang-diagnostic-error]

<<<<<<< HEAD
^

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected unqualified-id [clang-diagnostic-error]

<<<<<<< HEAD
^

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

<<<<<<< HEAD
  ^

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

<<<<<<< HEAD
    ^

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected expression [clang-diagnostic-error]

<<<<<<< HEAD
      ^

@@ -57,7 +58,12 @@ BETTER_ENUM(AlgorithmType, char,
ucc_verifier,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use of undeclared identifier 'HEAD' [clang-diagnostic-error]

<<<<<<< HEAD
        ^

@polyntsov polyntsov merged commit 1b5d8fc into Desbordante:main Jan 22, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants