This is a tool for generating Haskell types from JSON Schema specifications written in YAML format.
Install `stack` and `ghc` via `ghcup`, then install the system dependencies:

```sh
sudo apt-get install cmake gcc g++ curl wget
sudo apt-get install python3-dev libgmp3-dev libtinfo-dev
pip install jsonschema
```
```sh
mkdir path/to/generated-api  # output directory
stack build                  # builds the code generation tool
stack exec -- config-generation-exe --input "path/to/api/root" --output "path/to/generated-api" --repository_root "path/to/api/repository/root"
```
The `repository_root` option is needed if JSON Schema includes contain absolute paths relative to some directory. In that case this directory must be supplied in the `--repository_root` argument.
There is an option to overwrite some YAML schemas with a particular type. If you want to do so, pass a YAML file containing the paths and types via the `--overwritten_files` option; the `overwritten.yaml` file in this repository serves as an example. Alternatively, you can add a `haskell/overwrite_type` field to your YAML file, and it will be treated as if it had been specified via the `--overwritten_files` option.
There are some more options; you can list them with:

```sh
stack exec -- config-generation-exe --help
```
There is now a cabal project at `path/to/generated-api` with `test` and `src` directories and a `cbits` directory.
The validation function uses the Python `json` and `jsonschema` modules. `json` ships with the Python standard library, so only `jsonschema` needs installing:

```sh
pip install jsonschema
```
The important part is that if these dependencies are not installed, the `unsafe_validate` function exposed to Haskell will just return `False` every time, WITHOUT any indication that the problem is a missing dependency rather than a failed validation.
Three functions are exposed from Python to C through pybind11, and then from C to Haskell through the FFI:
```haskell
foreign import ccall "unsafe_validate" unsafe_validate :: CString -> CString -> IO CBool
foreign import ccall "start_python" start_python :: IO ()
foreign import ccall "end_python" end_python :: IO ()
```
extern "C"{
//initialize python interpretator
void start_python();
// validate via jsonschema.validate
bool unsafe_validate(const char* object, const char* scheme);
//shut down python interpretator
void end_python();
};
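Putting the three together from the Haskell side looks roughly like this. This is a minimal sketch assuming the foreign imports above are in scope; the object and schema strings are placeholders, not anything from this repository:

```haskell
import Foreign.C.String (withCString)
import Foreign.C.Types  (CBool)

-- Minimal usage sketch of the exposed FFI functions (placeholder inputs).
validateExample :: IO CBool
validateExample = do
    start_python                              -- bring the interpreter up once
    res <- withCString "{\"x\": 1}" $ \obj ->
             withCString "{\"type\": \"object\"}" $ \schema ->
               unsafe_validate obj schema
    end_python                                -- shut the interpreter down
    pure res
```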
See `test/Spec.hs` for an example (it is commented out at the moment).
pybind11 is in use, so you need to install everything it requires. However, there is no need to install pybind11 itself; it fetches itself locally during `cmake ..`:
```sh
cd path/to/generated-api/cbits/c_validate
mkdir cmake-build-debug && cd cmake-build-debug
cmake .. && make
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:absolute/path/to/generated-api/cbits/c_validate/cmake-build-debug
cabal build
cabal v2-test --extra-lib-dirs=absolute/path/to/generated-api/cbits/c_validate/cmake-build-debug
```
Prepare for a long wait.
This part gives an overview of what is going on while the tool works. I will try my best to describe all the corner cases, bugs, and weird behaviours. Let me say up front that there are some.
The tool starts by parsing all YAML files in the specified directory. The first gotcha lives right here.

GOTCHA: all YAML files that do not have a `schema` folder in their path are filtered out. The filtering is based on the absolute path. For details, look at the function `Main.collectFiles`; a rough sketch of the behaviour follows.
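As a hedged sketch, the filter behaves roughly like this (`keepSchemaYaml` is a hypothetical name, and treating `.yml` as YAML is my assumption):

```haskell
import System.FilePath (splitDirectories, takeExtension)

-- Rough model of the filtering described above, not the real implementation.
keepSchemaYaml :: FilePath -> Bool
keepSchemaYaml absPath =
    takeExtension absPath `elem` [".yaml", ".yml"]
        && "schema" `elem` splitDirectories absPath
```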
Parsing is done by `libyaml` via some conduit machinery for `!include` inlining. During this phase several things are done:
- all `!include path` directives are inlined
- as each `!include path` is inlined, a new field `haskell/origin` is injected to remember the path it was inlined from
- all `maxItems` and `minItems` fields are dropped
- all `additionalProperties` fields are dropped (a sketch of the dropping step follows this list)
- all invalid paths are treated as relative to `repository_root`
- `haskell/overwrite_type` fields are added according to the `--overwritten_files` option
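For instance, the field-dropping steps amount to something like the following simplification over parsed values (the real implementation in `Data.TransportTypes.Parsing.IncludeInjection` works on the libyaml event stream instead):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Aeson        (Value (..))
import qualified Data.Aeson.KeyMap as KM

-- Recursively drop the fields the generator ignores (illustration only).
dropIgnoredFields :: Value -> Value
dropIgnoredFields (Object o) =
    Object . KM.map dropIgnoredFields $
        foldr KM.delete o ["maxItems", "minItems", "additionalProperties"]
dropIgnoredFields (Array xs) = Array (fmap dropIgnoredFields xs)
dropIgnoredFields v          = v
```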
GOTCHA: if you are getting `/path/to/yaml/path/to/yaml/file.yaml is not found`, that usually means you misspelled `path/to/yaml/file.yaml` in the first place, since the path was checked before being appended to the repository root.

For more details, go here: `Data.TransportTypes.Parsing.IncludeInjection`.
Once parsing is done, you have a `Data.Yaml.Value` on your hands, and it is transformed via `Data.Yaml.FromJSON` into a `Data.TransportTypes.Parsing.ParserResult` object.
In terms of dependencies, YAML files form a forest, but at first I thought a tree would be enough. `ParserResult` represents a forest with a mutual node connecting all the trees: the node is stored in `mainType` and the forest in `deps`. Later this node is discarded in `Data.TransportTypes.CodeGen.Hylo.BuildUp.build`, as it is just an artifact of poor design.
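In other words, the result has roughly this shape. The field names `mainType` and `deps` come from the text above, but the concrete types here are illustrative guesses, with an empty `ModuleParts` standing in for `Data.TransportTypes.ModuleParts.ModuleParts`:

```haskell
-- Rough shape only; not the real definition from Data.TransportTypes.Parsing.
data ModuleParts

data ParserResult = ParserResult
    { mainType :: ModuleParts               -- the artificial shared root node
    , deps     :: [(FilePath, ModuleParts)] -- the forest of dependencies
    }
```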
The transformation of `Data.Yaml.Value` to `ParserResult` starts in `parseJSON` and uses the recursive function `Data.TransportTypes.Parsing.parseDispatch`. The core idea is to check, first, whether the object we are dispatching on was included via the `haskell/origin` field, and second, whether we have already met it; hence a state with a keymap: `ParserState`.
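Very schematically, the dispatch order looks like this (a toy model with hypothetical types; the real `parseDispatch` works inside the aeson parser with `ParserState`):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Control.Monad.State (State, gets, modify)
import           Data.Aeson          (Value (..))
import qualified Data.Aeson.KeyMap   as KM
import qualified Data.Set            as Set
import           Data.Text           (Text)

-- Toy model: consult the injected "haskell/origin" key first, then a
-- seen-set carried in state, so shared includes are processed only once.
dispatch :: Value -> State (Set.Set Text) (Maybe Value)
dispatch v@(Object o)
    | Just (String origin) <- KM.lookup "haskell/origin" o = do
        seen <- gets (Set.member origin)
        if seen
            then pure Nothing               -- already met: do not re-parse
            else do
                modify (Set.insert origin)  -- remember it for next time
                pure (Just v)
dispatch v = pure (Just v)
```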
Then, to conclude the parsing, some postprocessing is required. It is performed by `Data.TransportTypes.Parsing.postprocessParserResult`. The main goal of this postprocessing is to remove the common file prefix from all the files. It also changes dashes to underscores.
GOTCHA: the dash-to-underscore transformation is never reverted, so if you had dashes in your YAML files, they are gone forever after this step.
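The two renaming steps amount to something like this sketch (hypothetical helpers, not the project's actual code):

```haskell
import Data.List  (stripPrefix)
import Data.Maybe (fromMaybe)

-- Replace every dash with an underscore; this is the irreversible step.
dashesToUnderscores :: FilePath -> FilePath
dashesToUnderscores = map (\c -> if c == '-' then '_' else c)

-- Strip the common prefix if present, leaving the path alone otherwise.
dropCommonPrefix :: FilePath -> FilePath -> FilePath
dropCommonPrefix prefix path = fromMaybe path (stripPrefix prefix path)
```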
Parsing is done, so it is time for building. We now have a `ParserResult` on our hands, and essentially it represents a tree. A good thing to do with a tree is to traverse it from the leaves to the root, and this is exactly what happens during the building phase.
`Data.TransportTypes.CodeGen.Hylo.BreakDown` splits the tree over `Data.TransportTypes.ModuleParts.ModuleParts` into:

- `Data.TransportTypes.CodeGen.Hylo.Structure.NodeF` - the tree itself
- `Data.TransportTypes.CodeGen.Hylo.Structure.Payload` - the stored data
Then `Data.TransportTypes.CodeGen.Hylo.BuildUp.buildUp` traverses this tree in a hylomorphism-like fashion. But I made a design mistake, and the tree is actually a forest, so the beauty of the hylomorphism is not preserved. Instead there are a bunch of recursive functions inside the `Data.TransportTypes.CodeGen.Hylo.BuildUp` module that build not only the interconnected part of the forest but the disconnected parts as well.

One function to rule them all: `Data.TransportTypes.CodeGen.Hylo.BuildUp.build`, the same one that discards the mutual node.
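As a reminder of the shape being imitated, here is the textbook hylomorphism in recursion-schemes style; the project's traversal only resembles it:

```haskell
-- Tear a seed down with a coalgebra, then fold the layers back up with an
-- algebra; the two passes are fused into a single recursion.
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo alg coalg = alg . fmap (hylo alg coalg) . coalg
```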
Because the spirit of the hylomorphism survived, the traversing process is separated from the building process. As our goal is to convert everything to `HsModule'`s and write them to files, this is very convenient: tests and the types themselves are generated differently, but the traversing process is the same.
The traversal is organized with the help of `Data.TransportTypes.CodeGen.Hylo.BuildUp.Ctx`. It looks scary, but everything inside it has a purpose.

```haskell
type Ctx a = ReaderT U.ModulePrefix (StateT GeneratorState (Except String)) a
```
The Reader is for remembering the path to our module. Since we start from the leaves, there is no way to know the path by which we arrived, so we need a Reader to delay the prefix evaluation until we reach the root.

The State is for remembering includes. There is no need to duplicate modules that are included in multiple places, hence a hashmap to remember which modules we have already met and built.

The Except is needed in case something goes wrong.
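A toy illustration of the Reader part (hypothetical helpers, not the project's code): each level extends the prefix on the way down, so a leaf's computation sees its full module path only once the traversal from the root reaches it.

```haskell
import Control.Monad.Reader (ReaderT, ask, local)
import Data.List            (intercalate)

-- Push one more segment onto the prefix for the duration of a subcomputation.
withExtendedPrefix :: Monad m => String -> ReaderT [String] m a -> ReaderT [String] m a
withExtendedPrefix name = local (name :)

-- Render the accumulated prefix as a dotted module path.
currentModulePath :: Monad m => ReaderT [String] m String
currentModulePath = intercalate "." . reverse <$> ask
```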
Test generation is done inside the `Data.TransportTypes.CodeGen.TestGen` module.
- the `buildSpec` function builds the module for `Spec.hs`, where all the tests are called
- the `buildTest` function makes the tests themselves and the `Test.QuickCheck.Arbitrary` instances
- toJSON test: first, a sample is generated via `generic-random` and `quickcheck-instances`. Then it is converted to JSON and transformed to a `std::string`, and then to a Python `str`, along with the schema, which is stored inside the test file as a string literal. The sample is validated against the schema via the Python `jsonschema` package. The result is returned as a `Bool`.
  - GOTCHA: no exceptions are transported between Python and Haskell, so if something goes wrong, the validation result is just `False`, with no signal that anything went wrong.
  - GOTCHA: since the YAML conventions in the `cheops` project differ from the official JSON Schema draft, to get meaningful validation all `oneOf`s are transformed to `anyOf`s via a string transformation defined in the `where` clause of the schema literal in the test files.
- fromJSON test: checks that `fromJSON . toJSON == (fromJSON . toJSON) . (fromJSON . toJSON)` (see the property sketch below)
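Phrased as a QuickCheck-style property, the law amounts to this sketch; the generated `Spec.hs` states it in its own terms:

```haskell
import Data.Aeson (FromJSON, Result (..), ToJSON, fromJSON, toJSON)

-- One toJSON/fromJSON roundtrip must be a fixed point: roundtripping once
-- yields the same result as roundtripping twice.
prop_roundtripStable :: (Eq a, FromJSON a, ToJSON a) => a -> Bool
prop_roundtripStable x =
    case fromJSON (toJSON x) of
        Error _   -> False
        Success y -> fromJSON (toJSON y) == Success y
```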
Module generation is done inside `Data.TransportTypes.CodeGen.TypeGen`:

- `buildTypeDecl` builds the declaration for the type
- `InstanceGen.FromJson.buildFromJSONInstance` builds the `FromJSON` instance
- `InstanceGen.ToJson.buildToJSONInstance` builds the `ToJSON` instance
- `InstanceGen.ToJson.buildToSchemaInstance` builds the `ToSchema` instance
There are some comments inside these files about cases that are not representable in JSON Schema but are representable via the `Data.TransportTypes.TypeRep.TypeRep` data structure.