forked from peetucket/ImpactProbe
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
531 lines (391 loc) · 22 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
Last updated: February 12, 2011
ImpactProbe is an application which can be used to monitor the social impact
of outreach activities. It gathers data from various sources (i.e. Twitter
Search API) according to specified keyword/phrase parameters defined by the
user. Gathered results are presented to the user and can be clustered according
to natural language similarily. Results can also be displayed and downloaded as
a timeline graph (posts/time).
In this README you will find instructions for installing the application as well
as information for developers interested in improving the application. See the
LICENSE file in this same directory for the terms of redistribution and use.
If you have any questions/comments/requests, please contact us:
Adrian Laurenzi (author)
Peter Mangiafico (project manager)
================================================================================
Contents
================================================================================
1. Installation guide
1.1 Linux
1.2 Mac OS X
1.3 Windows XP/Vista/7
2. Developer information
2.1 General information for developers
2.2 Application structure
2.3 Data source APIs
2.4 Uploading changes to git
================================================================================
1. Installation guide
================================================================================
--------------------------------------------------------------------------------
1.1 Linux
--------------------------------------------------------------------------------
1.1.1 Software dependencies
----------------------------------------
Apache, PHP, and MySQL are required to run this application. This software
bundle is known as LAMP and there are many guides available that describe how to
install it on Linux. Here are guides for a few of the most popular Linux distros:
- Ubuntu: https://help.ubuntu.com/community/ApacheMySQLPHP
- Fedora: http://fedorasolved.org/server-solutions/lamp-stack
- Debian: http://wiki.debian.org/LaMp
NOTE: If you are installing Apache, PHP, and MySQL individually it is probably
easiest to install them in the following order: MySQL -> Apache -> PHP.
----------------------------------------
- Apache HTTP Server (version 1 or 2)
Although the application should work on Apache 1 we highly recommend using the
Apache 2, the most current version. You should follow the instructions for
installing PHP and Apache side-by-side according to the PHP installation guide
provided below. The official Apache website is here: http://httpd.apache.org
- PHP (>= version 5.3)
UNIX installation guide for PHP/Apache:
http://www.php.net/manual/en/install.unix.php
Downloads: http://www.php.net/downloads.php
***IMPORTANT NOTE: Be sure to install the cURL extension with PHP***
- MySQL Community Server
You should install each of the following MySQL Community Server packages:
server, client, devel, and shared. Each of these packages can be downloaded
here: http://dev.mysql.com/downloads/mysql#downloads
- Kohana PHP Framework v3 (included in this package)
Official website: http://kohanaframework.org
- Lemur Toolkit
Instructions for installing this toolkit are provided below in this README.
The official website for the Lemur toolkit is here:
http://www.lemurproject.org
- Linux cron
Cron should come natively installed on all Linux distrubutions. If for some
reason you do not have cron installed you can download it from the official
website: http://www.gnu.org/software/mcron
1.1.2 Copying files
----------------------------------------
After installing the required software unzip the application package into the
your Apache DocumentRoot (the root HTTP directory of the web server). The
DocumentRoot path should be listed in your apache configuration file which, if
you are using apache2, should be listed in /usr/local/apache2/conf/httpd.conf
or, if you created a new site, in the config file located in this directory:
/etc/apache2/sites-enabled
NOTE: Depending on your platform, the installation's subdirs may have lost
their permissions due to the zip extraction. Chmod them all to 755 by running
the following command from within the root directory of your application:
find . -type d -exec chmod 0755 {} \;
In this guide ~/ is the root directory of the application.
The following directories MUST be made writeable:
~/data/charts
~/data/lemur/docs
~/data/lemur/indexes
~/data/lemur/params
Use the following UNIX command to make each of the above directories writeable:
chmod 0777 <DIRECTORY>
We recommend that you do not modify the default directory structure
but if you choose to do so you must modify the application's configuration
settings described in the 'Configuration settings' section (1.1.6).
1.1.3 Importing MySQL tables
----------------------------------------
To import the MySQL tables needed by the application you must import the
mysql_tables.sql file provided in the root directory of this package.
First open the MySQL shell and enter your password:
mysql -u <USERNAME> -p
Replace <USERNAME> with your MySQL username (most likely this will be 'root')
While in the shell create a new MySQL database:
create database <DATABASE NAME>;
Replace <DATABASE NAME> with whatever you wish to name your database.
Exit the MySQL shell (type 'exit') and import the MySQL tables using the
following command:
mysql -u <USERNAME> -p <DATABASE NAME> < mysql_tables.sql
It may be useful to install PHPMyAdmin to manage your MySQL database, however
this is not essential. Download PHPMyAdmin here: http://www.phpmyadmin.net
1.1.4 Setting up Kohana (version 3)
----------------------------------------
NOTE: You may need to switch "short_open_tag = On" in your PHP.ini configuration
file and restart Apache. If it is set to "Off", Kohana may not work correctly.
For your convenience we have included the Kohana PHP Framework version 3.0.7
in this software package. Instructions for setting up Kohana 3.0.7 (included
in this package) are described below according to those described in official
Kohana installation guide found on their website:
http://kohanaframework.org/guide/about.install
(1) Open ~/kohana/application/bootstrap.php.example and make the following
changes:
- Set the default timezone for your application.
- Set the base_url in the Kohana::init call to reflect the location of the
kohana folder on your server.
After making the changes save it the file as bootstrap.php in the same
directory.
(2) Make sure the ~/kohana/application/cache and ~/kohana/application/logs
directories exist and are writable by the web server (chmod to 0777).
(3) Open ~/kohana/install.php.example and save it as install.php in the same
directory.
(4) Test your installation by opening the URL you set as the base_url in a
browser. You should see the installation page. If it reports any errors, you
will need to correct them before continuing.
***IMPORTANT NOTE: ImpactProbe requires that cURL is enabled in PHP. So it
must say "Pass" next to "cURL Enabled" under "Optional Tests". It it does
not you must install cURL into PHP: http://php.net/manual/en/book.curl.php
(5) Once your install page reports that your environment is set up correctly you
need to either rename or delete install.php in the root directory.
(6) To configure the MySQL database module copy the database.php file from
~/kohana/modules/database/config into the ~/kohana/application/config
directory. Open the newly copied file and edit the settings to point to your
MySQL database (hostname, username, password, and database).
(7) To make the application accessible from the root directory of the
application rather than ~/kohana you must rename ~/index.php.tmp to
index.php and delete ~/kohana/index.php. Then open
~/kohana/application/bootstrap.php again and set the base_url to the root
directory of the application.
NOTE: by default Kohana is configured to be in 'development' mode so if you plan
to make this application available on a public web server for security reasons
you should put Kohana in 'production' mode:
http://kohanaframework.org/3.0/guide/kohana/security/deploying
1.1.5 Setting up the Lemur Toolkit
----------------------------------------
Download the Lemur Toolkit (version 4.12) here:
https://sourceforge.net/projects/lemur/files/lemur/lemur-toolkit-4.12/
Untar the package and follow the instructions provided in the Lemur INSTALL file
to install the software. Be sure to include the --enable-cluster option when
running the configure script. Take note of the path where Lemur was installed
because you will need it in when establishing the application's configuration
settings (section 1.1.6).
1.1.5 Configuring cron
----------------------------------------
To enable the software to collect data at specified intervals you need to
configure cron. Open up ~/impact_probe.cron provided with the application and
change the directory paths to the Kohana index.php file on each line to point to
your index.php file (this should be in the root directory of the application).
In order for cron to work properly you might also have to change the 'php'
command to the full path to your php executable (typically this is
/usr/local/bin/php). Then issue this command to initialize a crontab:
crontab impact_probe.cron
To display your crontab issue this command:
crontab -l
Some systems may not execute the script unless you specify the full path to php
and use the -q flag. For example, instead of simply calling php, use something like:
/usr/local/bin/php -q <COMMAND>
NOTE: You should be aware that any command scheduled to run at a time when the
computer is turned off will not be executed. Cron can make up for missed
commands if you create bash scripts to execute each of the commands listed in
impact_probe.cron and place each script into the appropriate cron directory
(/etc/cron.hourly, /etc/cron.weekly, etc). However, if you do this you can only
use gathering intervals for which cron folders exist and you have placed a bash
script (for example, 'twice daily' will not work but 'daily' will).
1.1.6 Configuration settings
----------------------------------------
Configuration settings for the application are established in this file:
~/kohana/application/config/myconf.php.example
You will probably have to modify all URLs and paths in this file in order for
your application to work properly. You must change all file paths in
myconf.php.example to point to the root application path on your computer.
Replace '~/' with the full path to the root directory of your application. To
find the full path open the terminal and go to the root directory of your
application and enter the 'pwd' command. All URLs should begin with the HTTP
path to the application (for example http://localhost/ if your application is
in the root directory of your web server). All URLs must include a trailing
slash ('/') at the end of each URL but all file paths should NOT include a
trailing slash. After making the changes to myconf.php.example save the file as
myconf.php in the same directory.
NOTE: It is strongly recommended that you do not change your directory stucture
or configuration settings after you have started using the application because
this is likely to cause problems.
--------------------------------------------------------------------------------
1.2 Mac OS X
--------------------------------------------------------------------------------
This application has not yet be tested on Mac OS X but all the required
software is available for Mac OS X so it shouldn't be difficult to set up. If
you can provide instructions set up on Mac OS X please add them to this README
and upload the changes to the github repository.
1.2.1 Software dependencies (Mac OS X)
----------------------------------------
- Apache HTTP Server (version 1 or 2)
You should follow the instructions for installing PHP and Apache side-by-side
according to the PHP installation guide provided below. The official Apache
website is here: http://httpd.apache.org/
- PHP (>= version 5.3)
Mac OS X installation guide for PHP/Apache:
http://www.php.net/manual/en/install.macosx.php
Downloads: http://www.php.net/downloads.php
- MySQL Community Server
You should install each of the following MySQL Community Server packages:
server, client, devel, and shared. Each of these packages can be downloaded
for Windows here: http://dev.mysql.com/downloads/mysql#downloads
- Kohana PHP Framework v3 (included in this package)
Official website: http://kohanaframework.org/
- Lemur Toolkit
The Lemur Toolkit is available for Windows here:
http://sourceforge.net/projects/lemur/files/
Download lemur-*.tar.gz
- UNIX Cron
Cron should be natively installed on Max OS X. More information on how to use
cron on Mac OS X can be found here:
http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man8/cron.8.html
--------------------------------------------------------------------------------
1.3 Windows XP/Vista/7
--------------------------------------------------------------------------------
This application has not yet be tested on Windows but all the required
software is available for Windows XP (possibly also Vista or Windows 7) so you
should be able to get it set up. Windows XP/Vista/7 should all run Apache, PHP,
and MySQL. CRONw and the Lemur Toolkit can be downloaded for XP and this version
may also work on Windows Vista/7. If you can provide instructions set up on
Windows please add them to this README and upload the changes to the github
repository.
1.3.1 Software dependencies (Windows)
----------------------------------------
- Apache HTTP Server (version 1 or 2)
You should follow the instructions for installing PHP and Apache side-by-side
according to the PHP installation guide provided below. The official Apache
website is here: http://httpd.apache.org/
- PHP (>= version 5.3)
Windows installation guide for PHP/Apache:
http://www.php.net/manual/en/install.windows.php
Windows downloads: http://windows.php.net/download/
- MySQL Community Server
You should install each of the following MySQL Community Server packages:
server, client, devel, and shared. Each of these packages can be downloaded
for Windows here: http://dev.mysql.com/downloads/mysql#downloads
- Kohana PHP Framework v3 (included in this package)
Official website: http://kohanaframework.org/
- Lemur Toolkit
The Lemur Toolkit is available for Windows here:
http://sourceforge.net/projects/lemur/files/
Download either lemur-*-win64-install.exe or lemur-*-install.exe
- CRONw:
CRONw is the Windows version of cron which you can obtain from:
http://cronw.sourceforge.net/
================================================================================
2. Developer information
================================================================================
--------------------------------------------------------------------------------
2.1 General information for developers
--------------------------------------------------------------------------------
This application is written using the Kohana PHP Framework (version 3) which is
an HMVC framework. To contribute in the development of this application you will
have to familiarize with this framework. We chose this framework because it is
open source, under active development, lightweight, secure, and relatively easy
to learn. Be sure to learn Kohana version 3 (KO3) and not version 2, the older
version.
Primary Kohana resources:
- User Guide: http://kohanaframework.org/guide/about.kohana/
- API User Guide: http://kohanaframework.org/guide/api/
- Unofficial wiki: http://kerkness.ca/wiki/doku.php
- Community forum: http://forum.kohanaframework.org/
Other useful Kohana resources:
- Very basic Tutorial: http://kohanaframework.org/guide/tutorials.helloworld/
- Thorough & comprehensive tutorial:
http://www.dealtaker.com/blog/2009/11/20/kohana-php-3-0-ko3-tutorial-part-1/
- Conventions/Style: http://kohanaframework.org/guide/about.conventions/
- Resources for learning Kohana:
http://forum.kohanaframework.org/comments.php?DiscussionID=4691
Kohana is a popular framework so there are lots of resources available on the
Internet so don't be afraid to search around. But be aware that a lot of the
existing guides are for Kohana version 2.
--------------------------------------------------------------------------------
2.2 Application structure
--------------------------------------------------------------------------------
Important files & directories (~/ is the root directory of the application):
~/kohana/application/classes/controller - Main code for application
~/kohana/application/classes/model - Generally used to communicate with database
~/kohana/application/config - URL/path and database configuration settings
~/kohana/application/messages - Error messages
~/kohana/application/views/template.php - Page header & footer
~/kohana/application/views/pages - Page content
~/kohana/modules -
Potentially useful Kohana modules (enable these by modifying
~/kohana/application/bootstrap.php)
~/kohana/system - Kohana system files; in general these should not be modified
~/data -
Lemur Toolkit and Google Chart data is stored here (subfolders named as
the project_id contain data for that project)
~/data/lemur/stopwords.list -
A list of stopwords (one per line) that will be omitted when clustering is
performed
~/js - Javascript files (this application uses the JQuery framework)
~/css - Cascading Style Sheet files
~/images
--------------------------------------------------------------------------------
2.3 Data source APIs
--------------------------------------------------------------------------------
Data source APIs are the places from which the application collects data (for
example the Twitter Search API). Adding additional data source APIs is of utmost
importance because the utility of the application depends upon the amount of
data it is able to access.
2.3.1 Currently implemented data source APIs
----------------------------------------
Currently methods have been implemented (in
~/application/classes/controller/gather.php) to collect data from the following
sources:
- Twitter Search API
Documentation: http://dev.twitter.com/doc/get/search/
- RSS Feeds
More information: http://www.rssboard.org/
2.3.2 Adding additional data source APIs
----------------------------------------
There is one method in ~/application/classes/controller/gather.php for each data
source API (i.e. twitter_search). To allow the application to gather data from
an additional source API you must add a new method to gather.php. Use the `NEW
GATHERING METHOD TEMPLATE` which has been commented out in gather.php (please
leave a copy of the template for other to use). Choose a name for your method
name and modify the code as necessary. Please note that this template assumes
your API responds to either GET or POST requests. In order to activate your
method you must insert a new row into the `api_sources` table in the MySQL
database. The row should contain the name of your method you created in
gather.php and also a 'friendly' name. Here is an example SQL command:
INSERT INTO `api_sources` (`api_name`, `gather_method_name`)
VALUES ('Twitter Search', 'twitter_search');
Now when adding or modifying a project you should see your API appear under the
list with a checkbox next to it.
Suggested additional data source APIs to implement:
- Facebook Open Stream API:
http://wiki.developers.facebook.com/index.php/Using_the_Open_Stream_API/
- Blogger API: http://code.google.com/apis/blogger/
--------------------------------------------------------------------------------
2.4 Uploading changes to git
--------------------------------------------------------------------------------
The github repository for this application is located here:
http://github.com/peetucket/ImpactProbe/
Unlike SVN, git does not used a central repository. This is why git is
"distributed" and SVN is "centralized". Although this makes git an extremely
powerful system for collaborators, tracking changes between various
collaborators can quickly become difficult as multiple forks are created.
Please read the following before working with this code:
- Dealing with newlines: http://github.com/guides/dealing-with-newlines-in-git/
- Submitting changes from your fork:
http://github.com/guides/fork-a-project-and-submit-your-modifications/
- Using SSH keys with github:
http://github.com/guides/how-to-not-have-to-type-your-password-for-every-push/
2.4.1 Managing remote repositories
----------------------------------------
After installing git and signing up for an account, you will need to tell git
about the remote repository:
git remote add ImpactProbe [email protected]:alaurenz/ImpactProbe.git
This adds "ImpactProbe" as a remote repository that can be pulled from.
git checkout -b ImpactProbe/master
This creates a local branch "ImpactProbe/master" of the master branch of the
"ImpactProbe" repository.
After installing ImpactProbe locally you should create a .gitignore file in the
root directory of the application so that you do not track the changes you made
to your config files during the local installation. You can use the
.gitignore.example file provided in the root folder of the application.
2.4.2 Merging changes from remote repositories
----------------------------------------
Now that you have a remote repository, you can pull changes into your local
repository:
git checkout ImpactProbe/master
This switches to the previously created "ImpactProbe/master" branch:
git pull ImpactProbe master
This pulls all of the changes from the remote into the local
"ImpactProbe/master" branch:
git checkout master
This switches back to your local master branch:
git merge ImpactProbe/master
This merges all of the changes in the "ImpactProbe/master" branch into your
master branch:
git push
This pushes all the merged changes into your local fork. Now your fork is now in
sync with the origin repository!