A web scraper built to search for specific information on a given compound (and its synonyms)
# Fourmi

**Master branch**: [Build status](https://travis-ci.org/Recondor/Fourmi)

**Development branch**: [Build status](https://travis-ci.org/Recondor/Fourmi)

Fourmi is a web scraper for chemical substances. The program is designed to be
used as a search engine that queries multiple chemical databases for a specific
substance. It produces all available attributes of the substance, together with
the conditions associated with each attribute. Fourmi also attempts to estimate
the reliability of each data point to help the user decide which data to use.
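
As an illustration of the kind of output described above, a single data point could be modelled roughly as follows. This is a minimal sketch, not Fourmi's actual data model: the `DataPoint` class, its field names, and the `group_by_attribute` helper are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    """One scraped attribute of a substance (hypothetical structure)."""
    attribute: str                 # name of the property, e.g. "boiling point"
    value: str                     # raw value as scraped, e.g. "373.15 K"
    source: str                    # database the value was taken from
    conditions: str = ""           # conditions under which the value holds
    reliability: str = "unknown"   # free-form reliability estimate


def group_by_attribute(points):
    """Group data points by attribute name so competing values can be compared."""
    grouped = {}
    for point in points:
        grouped.setdefault(point.attribute, []).append(point)
    return grouped
```

Grouping by attribute places competing values for the same property side by side, which is where a reliability estimate becomes useful.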

The Fourmi project is an open source project licensed under the MIT license. Feel
free to contribute!

Fourmi is based on the [Scrapy framework](http://scrapy.org/), an open source
web scraping framework for Python. Most of the functionality of this project can
be traced to this framework. Should the documentation for this application fall
short, we suggest you take a close look at the
[Scrapy architecture](http://doc.scrapy.org/en/latest/topics/architecture.html)
and the [Scrapy documentation](http://doc.scrapy.org/en/latest/index.html).

### Installing

If you're installing Fourmi, please take a look at the [installation guide](...)
on our wiki. Once you've installed the application, make sure to check the
[usage guide](...).

### Using the Source

Using the Fourmi source code requires several dependencies. Take a look at
the [wiki page](...) on using the application source code for a step-by-step
installation guide.

When developing for the Fourmi project, keep in mind that code readability is a
must. To maintain readability, code should conform to the
[PEP-8](http://legacy.python.org/dev/peps/pep-0008/) style guide for Python
code. More information about the different structures and principles of the
Fourmi application can be found on our [wiki](...).

### To Do

The Fourmi project has the following goals for the near future:

__Main goals:__

- Improve our documentation and guides. (Assignee: Dekker)
- Build a graphical user interface (GUI) as an alternative to the command line
interface (CLI). (Assignee: Harmen)
- Compile the source into a Windows executable. (Assignee: Bas)
- Create a configuration file to hold logins and API keys.
- Determine the reliability of our data points.
- Create a module to gather data from NIST. (Assignee: Rob)
- Create a module to gather data from PubChem. (Assignee: Nout)

__Side goals:__

- Clean and unify data.
- Perform extensive reliability analysis using statistical tests.
- Test data with Descartes 1.
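
One of the main goals above is a configuration file for logins and API keys. The following is a minimal sketch of how such a file could be read with Python's standard `configparser` module. The section name, the option names, the placeholder credentials, and the `load_source_settings` helper are all hypothetical and not part of Fourmi.

```python
import configparser

# Hypothetical file contents: one section per data source.
EXAMPLE_CONFIG = """
[ExampleSource]
username = alice
apikey = 0123456789abcdef
"""


def load_source_settings(text, source):
    """Return the settings of one source as a plain dict (empty if absent)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(source):
        return {}
    return dict(parser[source])
```

Keeping credentials in a separate file like this would let them stay out of the source tree and out of version control.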

### Project Origin

The Fourmi project was started in February 2014 as part of a software
engineering course at Radboud University for students studying Computer
Science, Information Science or Artificial Intelligence. Students participate in
a real software development project as part of
[Giphouse](http://www.giphouse.nl/).

This particular project was started on behalf of Ivo B. Rietveld. As a chemist,
he needed an application that automatically searches for information on chemical
substances and creates a phase diagram. The so-called "Descartes" project was
split into two teams, each creating a different application that covers part of
the functionality. We are team Descartes 2, and as we were responsible for
creating a web crawler, we've named our application Fourmi (English: ant).

The following people were part of the original team:

- [Jip J. Dekker](http://jip.dekker.li)
- Rob ten Berge
- Harmen Prins
- Bas van Berkel
- Nout van Deijck
- Michail Kuznetcov