README.md 6.02 KB
Newer Older
1 2
![](logo.png)

3

Fize Jacques's avatar
Fize Jacques committed
4 5
[![Build Status](https://travis-ci.com/Jacobe2169/GMatch4py.svg?branch=master)](https://travis-ci.com/Jacobe2169/GMatch4py)
# GMatch4py a graph matching library for Python 
6

Fize Jacques's avatar
Fize Jacques committed
7

Fize Jacques's avatar
Fize Jacques committed
8 9
GMatch4py is a library dedicated to graph matching. Graph structure are stored in NetworkX graph objects.
GMatch4py algorithms were implemented with Cython to enhance performance.
Fize Jacques's avatar
Fize Jacques committed
10 11

## Requirements
Fize Jacques's avatar
Fize Jacques committed
12 13 14

 * Python 3
 * Numpy and Cython installed (if not : `(sudo) pip(3) install numpy cython`)
Fize Jacques's avatar
Fize Jacques committed
15 16 17
 
## Installation

Fize Jacques's avatar
Fize Jacques committed
18
To install `GMatch4py`, run the following commands:
Fize Jacques's avatar
Fize Jacques committed
19

20 21 22
```bash
git clone https://github.com/Jacobe2169/GMatch4py.git
cd GMatch4py
23
(sudo) pip(3) install .
Fize Jacques's avatar
Fize Jacques committed
24 25 26
```

## Get Started
27
### Graph input format
Fize Jacques's avatar
Fize Jacques committed
28

29
In `GMatch4py`, algorithms manipulate `networkx.Graph`, a complete graph model that 
30 31
comes with a large spectrum of parser to load your graph from various inputs : `*.graphml,*.gexf,..` (check [here](https://networkx.github.io/documentation/stable/reference/readwrite/index.html) to see all the format accepted)

Fize Jacques's avatar
Fize Jacques committed
32
### Use GMatch4py
33 34
If you want to use algorithms like *graph edit distances*, here is an example:

35
```python
36 37 38
# Gmatch4py use networkx graph 
import networkx as nx 
# import the GED using the munkres algorithm
Fize Jacques's avatar
Fize Jacques committed
39
import gmatch4py as gm
40 41 42
```

In this example, we use generated graphs using `networkx` helpers:
43
```python
44 45 46 47
g1=nx.complete_bipartite_graph(5,4) 
g2=nx.complete_bipartite_graph(6,4)
```

Fize Jacques's avatar
Fize Jacques committed
48
All graph matching algorithms in `Gmatch4py` work this way:
49 50 51
 * Each algorithm is associated with an object, each object having its specific parameters. In this case, the parameters are the edit costs (delete a vertex, add a vertex, ...)
 * Each object is associated with a `compare()` function with two parameters. First parameter is **a list of the graphs** you want to **compare**, i.e. measure the distance/similarity (depends on the algorithm). Then, you can specify a sample of graphs to be compared to all the other graphs. To this end, the second parameter should be **a list containing the indices** of these graphs (based on the first parameter list). If you rather compute the distance/similarity **between all graphs**, just use the `None` value.

52
```python
Fize Jacques's avatar
Fize Jacques committed
53
ged=gm.GraphEditDistance(1,1,1,1) # all edit costs are equal to 1
54 55 56 57 58
result=ged.compare([g1,g2],None) 
print(result)
```

The output is a similarity/distance matrix :
59
```python
60
Out[10]:
61 62
array([[0., 14.],
       [10., 0.]])
63 64 65
```
This output result is "raw", if you wish to have normalized results in terms of distance (or similarity) you can use :

66
```python
67 68 69 70
ged.similarity(result)
# or 
ged.distance(result)
```
Fize Jacques's avatar
Fize Jacques committed
71

Fize Jacques's avatar
Fize Jacques committed
72
## Exploit nodes and edges attributes
Fize Jacques's avatar
Fize Jacques committed
73

Fize Jacques's avatar
Fize Jacques committed
74 75 76 77 78 79 80 81
In this latest version, we add the possibility to exploit graph attributes ! To do so, the `base.Base` is extended with the `set_attr_graph_used(node_attr,edge_attr)` method.

```python
import networkx as nx 
import gmatch4py as gm
ged = gm.GraphEditDistance(1,1,1,1)
ged.set_attr_graph_used("theme","color") # Edge colors and node themes attributes will be used.
```
Fize Jacques's avatar
Fize Jacques committed
82 83

## List of algorithms
Fize Jacques's avatar
Fize Jacques committed
84

Fize Jacques's avatar
Fize Jacques committed
85 86 87
 * Graph Embedding
    * Graph2Vec [1]
    * DeepWalk [7]
Fize Jacques's avatar
Fize Jacques committed
88 89 90 91 92 93 94 95 96 97 98 99
 * Graph kernels
    * Random Walk Kernel (*debug needed*) [3]
        * Geometrical 
        * K-Step 
    * Shortest Path Kernel [3]
    * Weisfeiler-Lehman Kernel [4]
        * Subtree Kernel 
 * Graph Edit Distance [5]
    * Approximated Graph Edit Distance 
    * Hausdorff Graph Edit Distance 
    * Bipartite Graph Edit Distance 
    * Greedy Edit Distance
Fize Jacques's avatar
Fize Jacques committed
100 101 102 103
 * Vertex Ranking [2]
 * Vertex Edge Overlap [2]
 * Bag of Nodes (a bag of words model using nodes as vocabulary)
 * Bag of Cliques (a bag of words model using cliques as vocabulary)
Fize Jacques's avatar
Fize Jacques committed
104 105 106 107 108
 * MCS [6]
    

## Publications associated

Fize Jacques's avatar
Fize Jacques committed
109
  * [1] Narayanan, Annamalai and Chandramohan, Mahinthan and Venkatesan, Rajasekar and Chen, Lihui and Liu, Yang. Graph2vec: Learning distributed representations of graphs. MLG 2017, 13th International Workshop on Mining and Learning with Graphs (MLGWorkshop 2017).
Fize Jacques's avatar
Fize Jacques committed
110 111 112 113 114
  * [2] Papadimitriou, P., Dasdan, A., & Garcia-Molina, H. (2010). Web graph similarity for anomaly detection. Journal of Internet Services and Applications, 1(1), 19-30.
  * [3] Vishwanathan, S. V. N., Schraudolph, N. N., Kondor, R., & Borgwardt, K. M. (2010). Graph kernels. Journal of Machine Learning Research, 11(Apr), 1201-1242.
  * [4] Shervashidze, N., Schweitzer, P., Leeuwen, E. J. V., Mehlhorn, K., & Borgwardt, K. M. (2011). Weisfeiler-lehman graph kernels. Journal of Machine Learning Research, 12(Sep), 2539-2561.
  * [5] Fischer, A., Riesen, K., & Bunke, H. (2017). Improved quadratic time approximation of graph edit distance by combining Hausdorff matching and greedy assignment. Pattern Recognition Letters, 87, 55-62.
  * [6] A graph distance metric based on the maximal common subgraph, H. Bunke and K. Shearer, Pattern Recognition Letters, 1998  
Fize Jacques's avatar
Fize Jacques committed
115
  * [7] Perozzi, B., Al-Rfou, R., & Skiena, S. (2014, August). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 701-710). ACM.
Fize Jacques's avatar
Fize Jacques committed
116

Fize Jacques's avatar
Fize Jacques committed
117 118 119
## Author(s)

Jacques Fize, *jacques[dot]fize[at]cirad[dot]fr*
Fize Jacques's avatar
Fize Jacques committed
120

121
Some algorithms from other projects were integrated to Gmatch4py. **Be assured that
Fize Jacques's avatar
Fize Jacques committed
122
each code is associated with a reference to the original.**
Fize Jacques's avatar
Fize Jacques committed
123

124

125 126
## CHANGELOG

Fize Jacques's avatar
Fize Jacques committed
127 128 129 130 131 132 133 134
### 05.03.2019

 * Add Graph Embedding algorithms
 * Remove depreciated methods and classes
 * Add logo
 * Update documentation


135 136 137 138
### 25.02.2019
 * Add New Graph Class. Features : Cython Extensions, precomputed values (degrees, neighbor info), hash representation of edges and nodes for a faster comparison
 * Some algorithms are parallelized such as graph edit distances or Jaccard

Fize Jacques's avatar
Fize Jacques committed
139
## TODO List
Fize Jacques's avatar
Fize Jacques committed
140

Fize Jacques's avatar
Fize Jacques committed
141
  * Debug algorithms --> Random Walk Kernel, Deltacon
142
  * Optimize algorithms --> Vertex Ranking
143
=======
Fize Jacques's avatar
Fize Jacques committed
144 145 146 147 148
## Improvements

GMatch4py is going through some heavy changes to diminish the time execution of each algorithm. You may found an alpha version available in the branch `graph_cython`.

As of today, the results are promising (up to ![](https://latex.codecogs.com/gif.latex?%5Ctimes)36 for Jaccard)
Fize Jacques's avatar
Fize Jacques committed
149
![](https://i.ibb.co/qMZRHhN/multiplicator-2.png)
Fize Jacques's avatar
Fize Jacques committed
150
## TODO List
Fize Jacques's avatar
Fize Jacques committed
151

152
  * Debug algorithms --> :runner: (almost done !)