Mscoco transformations to BigML-COCO and documentation changes #261
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,16 +14,6 @@ the | |
BigMLer is open sourced under the `Apache License, Version | ||
2.0 <http://www.apache.org/licenses/LICENSE-2.0.html>`_. | ||
|
||
Support | ||
======= | ||
|
||
Please report problems and bugs to our `BigML.io issue | ||
tracker <https://github.com/bigmlcom/io/issues>`_. | ||
|
||
Discussions about the different bindings take place in the general | ||
`BigML mailing list <http://groups.google.com/group/bigml>`_. Or join us | ||
in our `Campfire chatroom <https://bigmlinc.campfirenow.com/f20a0>`_. | ||
|
||
Requirements | ||
============ | ||
|
||
|
@@ -54,7 +44,7 @@ using: | |
The external libraries used in this case exist for the majority of recent | ||
Operating System versions. Still, some of them might need especific | ||
compiler versions or dlls, so their installation may require an additional | ||
setup effort. | ||
setup effort and will not be supported by default. | ||
|
||
The full set of libraries can be installed using | ||
|
||
|
@@ -146,32 +136,26 @@ For a detailed description of authentication instructions on Windows see the | |
BigMLer on Windows | ||
================== | ||
|
||
To install BigMLer on Windows environments, you'll need `Python for Windows | ||
(v.2.7.x) <http://www.python.org/download/>`_ installed. | ||
|
||
In addition to that, you'll need the ``pip`` tool to install BigMLer. To | ||
install pip, first you need to open your command line window (write ``cmd`` in | ||
the input field that appears when you click on ``Start`` and hit ``enter``), | ||
download this `python file <http://python-distribute.org/distribute_setup.py>`_ | ||
and execute it | ||
|
||
.. code-block:: bash | ||
|
||
c:\Python27\python.exe distribute_setup.py | ||
|
||
After that, you'll be able to install ``pip`` by typing the following command | ||
To install BigMLer on Windows environments, you'll Python installed. | ||
The code has been tested with Python 3.10 and you can create a conda | ||
Review comment: nit: maybe make "conda" italics if it's usually spelled lowercase? Or maybe not, if everybody knows what that is. |
||
environment with that Python version or download it from `Python for Windows | ||
<http://www.python.org/download/>`_ and install it. In the last case, you'll | ||
Review comment: In the latter case |
||
also need to install the ``pip`` tool to install BigMLer. | ||
|
||
.. code-block:: bash | ||
|
||
c:\Python27\Scripts\easy_install.exe pip | ||
To install ``pip``, first you need to open your command line window | ||
Review comment: a command line terminal window (?) |
||
(write ``cmd`` in | ||
the input field that appears when you click on ``Start`` and hit ``enter``). | ||
Then you can follow the steps described, for example, in this `guide | ||
<https://monovm.com/blog/how-to-install-pip-on-windows-linux/#How-to-install-PIP-on-Windows?-[A-Step-by-Step-Guide]>`_ | ||
to install its latest version. | ||
|
||
And finally, to install BigMLer, just type | ||
And finally, to install BigMLer in its basic capacities, just type | ||
|
||
.. code-block:: bash | ||
|
||
c:\Python27\Scripts\pip.exe install bigmler | ||
python -m pip install bigmler | ||
|
||
and BigMLer should be installed in your computer. Then | ||
and BigMLer should be installed in your computer or conda environment. Then | ||
issuing | ||
|
||
.. code-block:: bash | ||
|
@@ -180,6 +164,11 @@ issuing | |
|
||
should show BigMLer version information. | ||
|
||
Extensions of BigMLer to use images are not supported in Windows by default. | ||
The libraries needed for those models are not available usually for that | ||
Review comment: are usually not available |
||
operating system. If your Machine Learning project involves images, we | ||
recommend that you choose a Linux based operating system. | ||
|
||
Finally, to start using BigMLer to handle your BigML resources, you need to | ||
set your credentials in BigML for authentication. If you want them to be | ||
permanently stored in your system, use | ||
|
@@ -189,6 +178,9 @@ permanently stored in your system, use | |
setx BIGML_USERNAME myusername | ||
setx BIGML_API_KEY ae579e7e53fb9abd646a6ff8aa99d4afe83ac291 | ||
|
||
Note that ``setx`` will not change the environment variables of your actual | ||
console, so you will need to open a new one to start using them. | ||
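
A quick way to confirm that a newly opened console actually sees the stored credentials is to check the two variables from Python. This is only a minimal sketch: the helper name ``missing_credentials`` is hypothetical and not part of BigMLer; only the variable names ``BIGML_USERNAME`` and ``BIGML_API_KEY`` come from the text above.

```python
import os

# Variable names taken from the setx commands above
REQUIRED_VARS = ("BIGML_USERNAME", "BIGML_API_KEY")


def missing_credentials(environ=None):
    """Return the names of the credential variables that are unset or empty.

    Hypothetical helper for illustration; BigMLer does not ship it.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Not visible in this console: " + ", ".join(missing))
    else:
        print("Both credential variables are visible to this process.")
```

If the script reports missing variables right after running ``setx``, opening a new console and running it again should show them as set.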
|
||
|
||
BigML Development Mode | ||
====================== | ||
|
@@ -347,3 +339,13 @@ Additional Information | |
|
||
For additional information, see | ||
the `full documentation for BigMLer on Read the Docs <http://bigmler.readthedocs.org>`_. | ||
|
||
|
||
Support | ||
======= | ||
|
||
Please report problems and bugs to our `BigML.io issue | ||
tracker <https://github.com/bigmlcom/io/issues>`_. | ||
|
||
Discussions about the different bindings take place in the general | ||
`BigML mailing list <http://groups.google.com/group/bigml>`_. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
# -*- coding: utf-8 -*- | ||
__version__ = '5.8.1' | ||
__version__ = '5.9.0' |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -334,12 +334,16 @@ def get_source_options(defaults=None): | |
'action': 'store', | ||
'dest': 'annotations_language', | ||
'default': defaults.get('annotations_language', None), | ||
'choices': ["VOC", "YOLO"], | ||
'choices': ["VOC", "YOLO", "COCO"], | ||
'help': ("Language used to provide the annotations for images." | ||
"Annotations are expected to be provided using " | ||
"on file per image. The --train option should point" | ||
"one file per image. The --train option should point" | ||
" to the directory that contains both images and" | ||
" the corresponding annotations.")}, | ||
" the corresponding annotations, unless some " | ||
" folder attribute is provided in each" | ||
" annotation. In that case it should point to" | ||
" the folder parent directory and --annotations-dir" | ||
Review comment: folder's |
||
" should be used to point to the annotations files.")}, | ||
|
||
# Annotations file | ||
# File that contains annotations for images | ||
|
@@ -356,7 +360,15 @@ def get_source_options(defaults=None): | |
'action': 'store', | ||
'dest': 'annotations_dir', | ||
'default': defaults.get('annotations_dir', None), | ||
'help': "Directory for individual annotation files."}, | ||
'help': ("Directory for individual annotation files." | ||
" Used when annotations are provided using " | ||
"one file per image. The --train option should point" | ||
" to the directory that contains both images and" | ||
" the corresponding annotations, unless some " | ||
" folder attribute is provided in each" | ||
" annotation. In that case it should point to" | ||
" the folder parent directory and --annotations-dir" | ||
Review comment: folder's (IIUC) |
||
" should be used to point to the annotations files.")}, | ||
|
||
# Images file | ||
# Compressed file with images used as reference for annotations | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -51,6 +51,11 @@ def relative_path(base_dir, absolute_path): | |
return os.path.relpath(absolute_path, base_dir) | ||
|
||
|
||
def get_file_ext(filename): | ||
"""Getting the file extension in lowercase and without the dot """ | ||
return os.path.splitext(filename)[1].lower()[1:] | ||
|
||
|
||
def fields_from_annotations(annotations_file): | ||
"""Infers the type of the fields that will contain the annotations | ||
in an annotations file. | ||
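
The new ``get_file_ext`` helper returns the extension lowercased and without the leading dot. In the hunks of this diff, the old ``bigml_metadata`` filter stripped the dot while the old YOLO and VOC filters compared the dotted form, so the helper makes all call sites consistent. The sketch below illustrates the difference; the ``IMAGE_EXTENSIONS`` list is a hypothetical stand-in (the real constant is not shown in this diff, and it is assumed here to hold dotless extensions), and ``is_image`` is an illustrative name only.

```python
import os

# Hypothetical stand-in: the real IMAGE_EXTENSIONS is not shown in this diff.
# It is assumed to contain extensions without the leading dot.
IMAGE_EXTENSIONS = ["jpg", "jpeg", "png"]


def get_file_ext(filename):
    """Getting the file extension in lowercase and without the dot"""
    return os.path.splitext(filename)[1].lower()[1:]


def is_image(filename):
    """Illustrative filter built on the helper (not part of the PR)."""
    return get_file_ext(filename) in IMAGE_EXTENSIONS


# The dotted form used by the old YOLO and VOC filters would never match
# a dotless list, whereas the helper's result does.
dotted = os.path.splitext("photos/IMG_001.JPG")[1].lower()
print(dotted in IMAGE_EXTENSIONS)      # dotted form, e.g. ".jpg"
print(is_image("photos/IMG_001.JPG"))  # dotless form, e.g. "jpg"
```

Files without an extension yield an empty string, so they are filtered out as long as the extensions list does not contain ``""``.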
|
@@ -116,9 +121,8 @@ def bigml_metadata(args, images_list=None, new_fields=None): | |
files = glob.glob(os.path.join(args.images_dir, "**"), | ||
recursive=True) | ||
images_list = [filename for | ||
filename in files if | ||
os.path.splitext(filename)[1].lower()[1:] in | ||
IMAGE_EXTENSIONS] | ||
filename in files if get_file_ext(filename) | ||
in IMAGE_EXTENSIONS] | ||
|
||
if images_list: | ||
if not os.path.exists(zip_path): | ||
|
@@ -157,16 +161,22 @@ def bigml_metadata(args, images_list=None, new_fields=None): | |
|
||
|
||
def bigml_coco_file(args, session_file): | ||
"""Translates from alternative annotations format, like VOC and YOLO to | ||
the format accepted by BigML | ||
"""Translates from alternative annotations format, like VOC, YOLO or | ||
MSCOCO to the format accepted by BigML | ||
|
||
""" | ||
|
||
if args.annotations_file is not None: | ||
args.original_annotations_file = args.annotations_file | ||
args.annotations_file = os.path.join(args.output_dir, "annotations.json") | ||
filenames = voc_to_cocojson(args.annotations_dir, args, | ||
session_file) \ | ||
if args.annotations_language == "VOC" else \ | ||
yolo_to_cocojson(args.annotations_dir, args, session_file) | ||
if args.annotations_language == "VOC": | ||
filenames = voc_to_cocojson(args.annotations_dir, args, session_file) | ||
elif args.annotations_language == "YOLO": | ||
filenames = yolo_to_cocojson(args.annotations_dir, args, session_file) | ||
elif args.annotations_language == "COCO": | ||
Review comment: I am neutral about it, but have you considered naming it MSCOCO to be explicit about its not being pure COCO? Or is it the case that the only COCO is MSCOCO?
Reply: I thought so too, but COCO is shorter and, from what I saw, is most common. |
||
filenames = mscoco_to_cocojson(args.original_annotations_file, | ||
args, session_file) | ||
|
||
return bigml_metadata(args, images_list=filenames, | ||
new_fields=[{"name": "boxes", "optype": "regions"}]) | ||
|
||
|
@@ -261,8 +271,7 @@ def yolo_to_cocojson(yolo_dir, args, session_file): | |
filenames = glob.glob(os.path.join(images_dir, "**"), | ||
recursive=True) | ||
filenames = [os.path.abspath(filename) for | ||
filename in filenames if | ||
os.path.splitext(filename)[1].lower() in | ||
filename in filenames if get_file_ext(filename) in | ||
IMAGE_EXTENSIONS] | ||
|
||
## Read yolo annotation txt file | ||
|
@@ -299,8 +308,8 @@ def yolo_to_cocojson(yolo_dir, args, session_file): | |
## the last one in the matched_file list is used | ||
for a_file in matched_files: | ||
filenames.append(a_file) | ||
ext = os.path.splitext(a_file)[1] | ||
if ext.lower() in IMAGE_EXTENSIONS: | ||
ext = get_file_ext(a_file) | ||
if ext in IMAGE_EXTENSIONS: | ||
image_filename = a_file | ||
else: | ||
warnings += 1 | ||
|
@@ -514,8 +523,7 @@ def voc_to_cocojson(voc_dir, args, session_file): | |
filenames = glob.glob(os.path.join(args.images_dir, "**"), | ||
recursive=True) | ||
filenames = [os.path.abspath(filename) for | ||
filename in filenames if | ||
os.path.splitext(filename)[1].lower() in | ||
filename in filenames if get_file_ext(filename) in | ||
IMAGE_EXTENSIONS] | ||
|
||
for a_file in annotation_file_list: | ||
|
@@ -568,3 +576,89 @@ def voc_to_cocojson(voc_dir, args, session_file): | |
|
||
return [relative_path(args.images_dir, filename) for filename in \ | ||
filenames] | ||
|
||
def mscoco_to_cocojson(mscoco_file, args, session_file): | ||
"""Translates annotations from a MS COCO format, where each image is | ||
associated with a JSON file that contains one object per associated info. | ||
Maps images, categories and annotations to image file names, labels and | ||
regions. It returns the list of images it refers to. | ||
|
||
""" | ||
|
||
output_json_array = [] | ||
|
||
filenames = [] | ||
labels = {} | ||
images = {} | ||
|
||
logfile_name = args.annotations_file + ".log" | ||
|
||
with open(logfile_name, "w") as logfile: | ||
|
||
warnings = 0 | ||
message = "Start converting COCO file from " + mscoco_file + "\n" | ||
u.log_message(message, session_file, console=args.verbosity) | ||
logfile.write("\n\n%s\n" % message) | ||
|
||
# Loading the MS-COCO json into memory | ||
|
||
with open(mscoco_file, "r") as handle: | ||
data = json.load(handle) | ||
|
||
# Images will be found either in the images_dir file or where | ||
# the annotation file points to | ||
if args.images_dir is not None and os.path.exists(args.images_dir): | ||
filenames = glob.glob(os.path.join(args.images_dir, "**"), | ||
recursive=True) | ||
paths = [os.path.abspath(filename) for | ||
filename in filenames if get_file_ext(filename) in | ||
IMAGE_EXTENSIONS] | ||
filenames = [os.path.basename(path) for path in paths] | ||
|
||
# Extracting the file_name and id into a dict | ||
images = dict([[image['id'], | ||
{ "file": image['file_name'], "boxes": [] }] | ||
for image in data['images'] if image['file_name'] in | ||
filenames]) | ||
if data.get("categories") and data['categories'][0].get("name"): | ||
# Extract the image category labels into a dict | ||
labels = dict([[category['id'], | ||
{ "name": category['name'], | ||
"super": category.get('supercategory', "") } ] | ||
for category in data['categories']]) | ||
# Adding the regions data | ||
if data.get('annotations'): | ||
for annotation in data['annotations']: | ||
images[annotation["image_id"]]["boxes"].append({ | ||
"label": labels[annotation['category_id']]['name'], | ||
"xmin": int(annotation["bbox"][0]), | ||
"ymin": int(annotation["bbox"][1]), | ||
"xmax": int(annotation["bbox"][0] + annotation["bbox"][2]), | ||
"ymax": int(annotation["bbox"][1] + annotation["bbox"][3]) | ||
}) | ||
|
||
if labels[annotation['category_id']]['super']: | ||
images[annotation["image_id"]]["boxes"].append({ | ||
"label": labels[annotation['category_id']]['super'], | ||
"xmin": int(annotation["bbox"][0]), | ||
"ymin": int(annotation["bbox"][1]), | ||
"xmax": int(annotation["bbox"][0] + | ||
annotation["bbox"][2]), | ||
"ymax": int(annotation["bbox"][1] + | ||
annotation["bbox"][3]) | ||
}) | ||
|
||
output_json_array = [images[image_id] for image_id in images.keys()] | ||
|
||
if warnings > 0: | ||
message = f"\nThere are {warnings} warnings, " \ | ||
f"see the log file {logfile_name}\n" | ||
u.log_message(message, session_file, console=args.verbosity) | ||
|
||
filenames = [image['file'] for image in output_json_array] | ||
|
||
with open(args.annotations_file, 'w') as handler: | ||
json.dump(output_json_array, handler, indent=2) | ||
|
||
return [relative_path(args.images_dir, filename) for filename in \ | ||
filenames] |
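
For reference, the box arithmetic in ``mscoco_to_cocojson`` relies on MS COCO storing ``bbox`` as ``[x, y, width, height]`` with the origin at the top-left corner, so the far corner is obtained by adding the width and height. The sketch below isolates that conversion. It also looks up images and labels with ``.get``, since the function in the diff indexes ``images[annotation["image_id"]]`` directly and would presumably raise ``KeyError`` for annotations whose image was filtered out. The names ``coco_bbox_to_region`` and ``attach_regions`` are hypothetical and not part of the PR.

```python
def coco_bbox_to_region(bbox, label):
    """Convert a COCO [x, y, width, height] box into corner coordinates."""
    x, y, width, height = bbox
    return {
        "label": label,
        "xmin": int(x),
        "ymin": int(y),
        "xmax": int(x + width),
        "ymax": int(y + height),
    }


def attach_regions(images, labels, annotations):
    """Append one region per annotation to its image entry.

    Annotations that point to an unknown image or category are skipped
    and counted, instead of raising KeyError. Returns the skipped count.
    """
    skipped = 0
    for annotation in annotations:
        image = images.get(annotation["image_id"])
        label = labels.get(annotation["category_id"])
        if image is None or label is None:
            skipped += 1
            continue
        image["boxes"].append(
            coco_bbox_to_region(annotation["bbox"], label["name"]))
    return skipped


if __name__ == "__main__":
    images = {1: {"file": "a.jpg", "boxes": []}}
    labels = {7: {"name": "cat", "super": "animal"}}
    annotations = [
        {"image_id": 1, "category_id": 7, "bbox": [10.5, 20, 30, 40]},
        {"image_id": 2, "category_id": 7, "bbox": [0, 0, 5, 5]},
    ]
    print(attach_regions(images, labels, annotations), images[1]["boxes"])
```

A skipped count like the one returned here could feed the ``warnings`` counter, which the function in the diff initializes but never increments.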
Review comment: typo: you'll need