Merge 'develop' for gtools-1.0.3; exit if too many obs: SPI 32-bit integer bug

* Closes #43
* Added option to selectively test via gtools, test[]
* Gtools exits with error if `_N > 2^31-1` and points the user to the
  pertinent bug report. The SPI uses long integers (32-bit) instead of
  long long integers (64-bit) for its observation numbers (not to
  mention they are signed integers, but alas). At any rate, this de facto
  limits the number of observations to 2^31-1.
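
To make the limit concrete, here is a minimal Stata sketch. It is illustrative only: the guard this commit actually adds lives in `_gtools_internal.ado`, and the message text below is made up for the example.

```stata
* 2^31 - 1 = 2,147,483,647 is the largest value a signed 32-bit integer can hold
display %21.0gc 2^31 - 1

* Hypothetical pre-flight check for a do-file, run before calling any gtools command
if ( _N > 2^31 - 1 ) {
    display as error "more than 2^31-1 observations; gtools will refuse to run"
    exit 17800
}
```

The return code 17800 mirrors the one introduced in this commit. Run from a do-file, `exit 17800` stops the do-file with that return code.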
mcaceresb committed Aug 19, 2018
2 parents 7c586fe + 8d8e928 commit 1714770
Showing 23 changed files with 238 additions and 78 deletions.
2 changes: 1 addition & 1 deletion .appveyor.yml
@@ -1,4 +1,4 @@
version: "generic-1.0.0-{build}"
version: "generic-1.0.3-{build}"

environment:
matrix:
8 changes: 4 additions & 4 deletions README.md
@@ -13,9 +13,9 @@ implementation of collapse, pctile, xtile, contract, egen, isid,
levelsof, duplicates, and unique/distinct using C plugins for a massive
speed improvement.

`version 1.0.2 08Aug2018`
Builds: Linux, OSX [![Travis Build Status](https://travis-ci.org/mcaceresb/stata-gtools.svg?branch=master)](https://travis-ci.org/mcaceresb/stata-gtools),
Windows (Cygwin) [![Appveyor Build status](https://ci.appveyor.com/api/projects/status/2bh1q9bulx3pl81p/branch/master?svg=true)](https://ci.appveyor.com/project/mcaceresb/stata-gtools)
`version 1.0.3 18Aug2018`
Builds: Linux, OSX [![Travis Build Status](https://travis-ci.org/mcaceresb/stata-gtools.svg?branch=develop)](https://travis-ci.org/mcaceresb/stata-gtools),
Windows (Cygwin) [![Appveyor Build status](https://ci.appveyor.com/api/projects/status/2bh1q9bulx3pl81p/branch/develop?svg=true)](https://ci.appveyor.com/project/mcaceresb/stata-gtools)

Faster Stata for Common Operations
----------------------------------
@@ -177,7 +177,7 @@ gtools, upgrade
You can also install the github version directly
```stata
local github "https://raw.githubusercontent.com"
net install gtools, from(`github'/mcaceresb/stata-gtools/master/build/)
net install gtools, from(`github'/mcaceresb/stata-gtools/develop/build/)
```
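
If gtools is already installed, the `branch()` option touched by this commit suggests the same install can also be done through the `gtools` command itself. The following is a sketch based on the option list in `gtools.ado`, not a verified recipe for every release:

```stata
* Reinstall gtools from the develop branch; the values accepted in this
* commit are develop, master, and osx, and the default is master
gtools, upgrade branch(develop)
```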

### Examples
16 changes: 14 additions & 2 deletions build/_gtools_internal.ado
@@ -1,5 +1,5 @@
*! version 1.0.1 23Jul2018 Mauricio Caceres Bravo, [email protected]
*! Encode varlist using Jenkin's 128-bit spookyhash via C plugins
*! version 1.0.3 18Aug2018 Mauricio Caceres Bravo, [email protected]
*! gtools function internals

* rc 17000
* rc 17001 - no observations
@@ -8,6 +8,7 @@
* rc 17004 - strL variables could not be compressed
* rc 17005 - strL contains binary data
* rc 17006 - strL variables unknown error
* rc 17800 - More than 2^31-1 obs
* rc 17459
* rc 17900
* rc 17999
@@ -47,6 +48,17 @@ program _gtools_internal, rclass
exit 17001
}

if ( `=_N > 2^31-1' ) {
local nmax = trim("`: disp %21.0gc 2^31-1'")
di as err `"too many observations"'
di as err `""'
di as err `"A Stata bug prevents gtools from working with more than `nmax' observations."'
di as err `"See {browse "https://www.statalist.org/forums/forum/general-stata-discussion/general/1457637"}"'
di as err `"and {browse "https://github.com/mcaceresb/stata-gtools/issues/43"}"'
clean_all 17800
exit 17800
}

local 00 `0'

* Time the entire function execution
7 changes: 7 additions & 0 deletions build/changelog.md
@@ -1,6 +1,13 @@
Change Log
==========

## gtools-1.0.3 (2018-08-18)

### Bug fixes

* Gtools exits with error if `_N > 2^31-1` and points the user to the
pertinent bug report.

## gtools-1.0.2 (2018-08-08)

### Enhancements
52 changes: 41 additions & 11 deletions build/gtools.ado
@@ -1,4 +1,4 @@
*! version 1.0.2 08Aug2018 Mauricio Caceres Bravo, [email protected]
*! version 1.0.3 18Aug2018 Mauricio Caceres Bravo, [email protected]
*! Program for managing the gtools package installation

capture program drop gtools
@@ -20,10 +20,15 @@ program gtools
showcase ///
examples ///
test ///
TESTs(str) ///
branch(str) ///
]

if ( "`branch'" == "" ) local branch master
if ( `"`branch'"' == "" ) local branch master
if !inlist(`"`branch'"', "develop", "master", "osx") {
disp as err "unknown branch `branch'; available: develop master osx"
exit 198
}

local cwd `c(pwd)'
local github https://raw.githubusercontent.com/mcaceresb/stata-gtools/`branch'
@@ -41,7 +46,7 @@ program gtools
gtools_licenses
}

if ( `"`dependencies'`hashlib'`install_latest'`upgrade'`dll'`showcase'`examples'`test'"' == `""' ) {
if ( `"`dependencies'`hashlib'`install_latest'`upgrade'`dll'`showcase'`examples'`test'`tests'"' == `""' ) {
exit 0
}
}
@@ -83,7 +88,7 @@ program gtools
di as txt "Success!"
cd `"`cwd'"'

if ( `"`hashlib'`install_latest'`upgrade'`dll'`showcase'`examples'`test'"' == `""' ) {
if ( `"`hashlib'`install_latest'`upgrade'`dll'`showcase'`examples'`test'`tests'"' == `""' ) {
exit 0
}
}
@@ -92,7 +97,7 @@ program gtools
cap net uninstall gtools
net install gtools, from(`github'/build) replace
* gtools, dependencies replace
if ( `"`hashlib'`dll'`showcase'`examples'`test'"' == `""' ) {
if ( `"`hashlib'`dll'`showcase'`examples'`test'`tests'"' == `""' ) {
exit 0
}
}
@@ -155,30 +160,55 @@ program gtools
}
}
else local hashlib spookyhash.dll
if ( `"`showcase'`examples'`test'"' == `""' ) {
if ( `"`showcase'`examples'`test'`tests'"' == `""' ) {
exit 0
}
}
else if ( `hashusr' | ("`dll'" == "dll") ) {
di as txt "-gtools, hashlib()- and -gtools, dll- only on Windows."
if ( `"`showcase'`examples'`test'"' == `""' ) {
if ( `"`showcase'`examples'`test'`tests'"' == `""' ) {
exit 0
}
}

if ( "`showcase'`examples'" != "" ) {
gtools_showcase
if ( "`test'" == "" ) {
if ( "`test'`tests'" == "" ) {
exit 0
}
}

if ( "`test'" != "" ) {
disp as txt "{bf:WARNING:} Unit tests from branch `branch' take 1-3 hours!"
if ( "`test'`tests'" != "" ) {
local t_hours comparisons
local t_days bench_full
local t_known dependencies basic_checks comparisons switches bench_test bench_full
local t_extra: list tests - t_known

if ( `:list sizeof t_extra' ) {
disp `"(unknown tests detected: `t_extra'; will try to run anyway)"'
}

if ( "`tests'" == "" ) {
disp as txt "{bf:WARNING:} Default unit tests from branch `branch' can take several"
disp as txt "hours. See {help gtools:help gtools} for details on unit testing."
}
else if ( `:list t_hours in tests' ) {
disp as txt "{bf:WARNING:} Unit tests"
disp as txt _n(1) " `tests'" _n(1)
disp as txt "from branch `branch' can take several hours. See {help gtools:help gtools} for details."
}
else if ( `:list t_days in tests' ) {
disp as txt "{bf:WARNING:} Unit tests"
disp as txt _n(1) " `tests'" _n(1)
disp as txt "from branch `branch' can take more than a day. See {help gtools:help gtools} for details."
}
else {
disp as txt "{bf:Note:} Unit tests '`tests'' from branch `branch'."
}
disp as txt "Are you sure you want to run them? (yes/no)", _request(GTOOLS_TESTS)
if inlist(`"${GTOOLS_TESTS}"', "y", "yes") {
global GTOOLS_TESTS
cap noi do `github'/build/gtools_tests.do
cap noi do `github'/build/gtools_tests.do `tests'
exit _rc
}
else {
4 changes: 2 additions & 2 deletions build/gtools.pkg
@@ -1,4 +1,4 @@
v 1.0.2
v 1.0.3
d
d 'GTOOLS': Faster implementation of common Stata commands optimized for large datasets
d
@@ -45,7 +45,7 @@ d
d Author: Mauricio Caceres Bravo
d Support: email [email protected]
d
d Distribution-Date: 20180808
d Distribution-Date: 20180818
d
f _gtools_internal.ado
f gcollapse.ado
21 changes: 19 additions & 2 deletions build/gtools.sthlp
@@ -1,5 +1,5 @@
{smcl}
{* *! version 1.0.2 08Aug2018}{...}
{* *! version 1.0.3 18Aug2018}{...}
{viewerdialog gtools "dialog gtools"}{...}
{vieweralsosee "[R] gtools" "mansection R gtools"}{...}
{viewerjumpto "Syntax" "gtools##syntax"}{...}
@@ -82,7 +82,7 @@ traditional stata commands. The following are available as part of gtools
{p_end}
{synopt :{opt showcase}}Alias for {opt examples}.
{p_end}
{synopt :{opt test}}Run gtools unit tests (1-3h) from the specified github branch (default is master).
{synopt :{opt test[(tests)]}}Run unit tests, optionally specifying which tests to run.
{p_end}
{synopt :{opth branch(str)}}Github branch to use (default is master).
{p_end}
@@ -138,6 +138,23 @@ is required for the plugin to execute correctly.
{opt examples} (alias {opt showcase}) prints examples of how to use
various gtools functions.

{phang}

{opt test[(tests)]} Run unit tests, optionally specifying which tests
to run. Tests available are: dependencies, basic_checks, bench_test,
comparisons, switches, bench_full. A good set of "small" tests, which
takes 10-20 minutes, is {cmd: dependencies basic_checks bench_test}.
By default, however, the first 5 tests are run, which takes 1-3h. The
bulk of that time is from {bf:comparisons}, which compares the results
from gtools to those of various native counterparts under several
different conditions. {bf:bench_full} is not run by default because it
benchmarks gtools against Stata using modestly-sized data (millions).
Some Stata commands are very slow under some of the benchmarks, meaning
this can take well over a day.

{phang}
{opth branch(str)} Github branch to use (default is master).

{marker author}{...}
{title:Author}

Binary file modified build/gtools_unix_v2.plugin
Binary file not shown.
Binary file modified build/gtools_unix_v3.plugin
Binary file not shown.
2 changes: 1 addition & 1 deletion build/stata.toc
@@ -1,3 +1,3 @@
v 1.0.0
v 1.0.3
d Mauricio Caceres Bravo, [email protected]
p 'GTOOLS': Generic stata repo that uses C plugins
7 changes: 7 additions & 0 deletions changelog.md
@@ -1,6 +1,13 @@
Change Log
==========

## gtools-1.0.3 (2018-08-18)

### Bug fixes

* Gtools exits with error if `_N > 2^31-1` and points the user to the
pertinent bug report.

## gtools-1.0.2 (2018-08-08)

### Enhancements
2 changes: 1 addition & 1 deletion docs/index.md
@@ -5,7 +5,7 @@ implementation of collapse, pctile, xtile, contract, egen, isid,
levelsof, duplicates, and unique/distinct using C plugins for a massive
speed improvement.

`version 1.0.2 08Aug2018`
`version 1.0.3 18Aug2018`
Builds: Linux, OSX [![Travis Build Status](https://travis-ci.org/mcaceresb/stata-gtools.svg?branch=master)](https://travis-ci.org/mcaceresb/stata-gtools),
Windows (Cygwin) [![Appveyor Build status](https://ci.appveyor.com/api/projects/status/2bh1q9bulx3pl81p/branch/master?svg=true)](https://ci.appveyor.com/project/mcaceresb/stata-gtools)

21 changes: 19 additions & 2 deletions docs/stata/gtools.sthlp
@@ -1,5 +1,5 @@
{smcl}
{* *! version 1.0.2 08Aug2018}{...}
{* *! version 1.0.3 18Aug2018}{...}
{viewerdialog gtools "dialog gtools"}{...}
{vieweralsosee "[R] gtools" "mansection R gtools"}{...}
{viewerjumpto "Syntax" "gtools##syntax"}{...}
@@ -82,7 +82,7 @@ traditional stata commands. The following are available as part of gtools
{p_end}
{synopt :{opt showcase}}Alias for {opt examples}.
{p_end}
{synopt :{opt test}}Run gtools unit tests (1-3h) from the specified github branch (default is master).
{synopt :{opt test[(tests)]}}Run unit tests, optionally specifying which tests to run.
{p_end}
{synopt :{opth branch(str)}}Github branch to use (default is master).
{p_end}
@@ -138,6 +138,23 @@ is required for the plugin to execute correctly.
{opt examples} (alias {opt showcase}) prints examples of how to use
various gtools functions.

{phang}

{opt test[(tests)]} Run unit tests, optionally specifying which tests
to run. Tests available are: dependencies, basic_checks, bench_test,
comparisons, switches, bench_full. A good set of "small" tests, which
takes 10-20 minutes, is {cmd: dependencies basic_checks bench_test}.
By default, however, the first 5 tests are run, which takes 1-3h. The
bulk of that time is from {bf:comparisons}, which compares the results
from gtools to those of various native counterparts under several
different conditions. {bf:bench_full} is not run by default because it
benchmarks gtools against Stata using modestly-sized data (millions).
Some Stata commands are very slow under some of the benchmarks, meaning
this can take well over a day.

{phang}
{opth branch(str)} Github branch to use (default is master).

{marker author}{...}
{title:Author}

12 changes: 11 additions & 1 deletion docs/usage/gtools.md
@@ -48,6 +48,16 @@ Options

- `examples` (alias `showcase`) Print examples of how to use available gtools functions.

- `test` Runs the gtools unit tests (1-3h) from the specified github branch (default is master).
- `test[(str)]` Run unit tests, optionally specifying which tests to run. Tests
available are: `dependencies`, `basic_checks`, `bench_test`,
`comparisons`, `switches`, `bench_full`. A good set of "small" tests,
which takes 10-20 minutes, is `dependencies basic_checks bench_test`. By
default, however, the first 5 tests are run, which takes 1-3h. The bulk
of that time is from `comparisons`, which compares the results from
gtools to those of various native counterparts under several different
conditions. `bench_full` is not run by default because it benchmarks
gtools against Stata using modestly-sized data (millions). Some Stata
commands are very slow under some of the benchmarks, meaning this can
take well over a day.

- `branch(str)` Github branch to use (default is master).
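
The options above could be combined roughly as follows. This is a sketch based on the option names in this commit (`test`, `tests()`, `branch()`), not on a tested session. Each call asks for a yes/no confirmation before the test script is downloaded and run.

```stata
* Quick subset suggested above (reported to take 10-20 minutes)
gtools, tests(dependencies basic_checks bench_test)

* Same subset, fetching the test script from the develop branch
gtools, tests(dependencies basic_checks bench_test) branch(develop)

* Default set: the first five tests (reported to take 1-3 hours)
gtools, test
```

Listing the tests explicitly is the only way to run `bench_full`, since it is left out of the default set.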
Binary file modified lib/plugin/gtools_unix_v2.plugin
Binary file not shown.
Binary file modified lib/plugin/gtools_unix_v3.plugin
Binary file not shown.