CSV2COCO module

Iterable2COCO

Bases: object

A class providing methods for parsing a "flat" iterable (e.g. a csv) of annotations into COCO format. Each "row" should contain the image filename and the annotation information (e.g. a bounding box or keypoints, and a category label).

Parameters:

Name Type Description Default
config Iterable2COCOConfig

a configuration dictionary detailing which columns in each row correspond to various COCO features such as filename and bounding box coordinates.

required

Attributes:

Name Type Description
bbox_parser IterableBBoxParser

a helper for parsing bounding boxes and the concomitant configuration

keypoint_parser IterableKeypointParser

a helper for parsing keypoints and the concomitant configuration

Source code in pycocowriter/csv2coco.py
class Iterable2COCO(object):
    '''
    class providing methods for parsing a "flat" iterable (e.g. a csv) of annotations into COCO
    format.  Each "row" should contain things such as the image filename, 
    and the annotation information (e.g a bounding box or keypoints and a category label)

    Parameters
    ----------
    config: Iterable2COCOConfig
        a configuration dictionary detailing which columns in each row correspond to various COCO
        features such as filename and bounding box coordinates.

    Attributes
    ----------
    bbox_parser: IterableBBoxParser
        a helper for parsing bounding boxes and the concomitant configuration
    keypoint_parser: IterableKeypointParser
        a helper for parsing keypoints and the concomitant configuration
    '''

    def __init__(self, config: Iterable2COCOConfig):
        self.config = config
        self.bbox_parser = IterableBBoxParser(config)
        self.keypoint_parser = IterableKeypointParser(config)

    def _get_scalar(self, field: str, row: Sequence):
        '''
        get a single value from a row given the field name in the configuration

        Parameters
        ----------
        field: str
            the field name as expected in the configuration, and should point to a column index.
            viz. self.config[field] should provide an index into row
        row: Sequence
            an indexable "row" e.g. from a csv file

        Returns
        -------
        scalar: any
           a single value expected for that field. 
        '''
        if field not in self.config:
            return None
        return row[self.config[field]]

    def parse(self, row_iterable: Iterable[Sequence]) -> tuple[
        list[coco.COCOImage], list[coco.COCOAnnotation], list[coco.COCOCategory]
    ]:
        '''
        parse an iterable of rows (e.g. from a csv file) containing image annotation information into
        COCO format.

        Parameters
        ----------
        row_iterable: Iterable[Sequence]
            an iterable of rows containing annotation information

        Returns
        -------
        images: list[COCOImage]
            a list of all unique images listed in the iterable, in COCO format
        annotations: list[COCOAnnotation]
            a list of all annotations listed in the iterable, correctly indexed
            against the images and categories lists
        categories: list[COCOCategory]
            a list of all unique categories listed in the iterable, in COCO format        
        '''
        categories = coco.COCOCategories()
        images = coco.COCOImages()
        annotations = []
        if 'meta' in self.config and 'skiprows' in self.config.meta:
            utils.skiprows(row_iterable, self.config.meta.skiprows)
        keypoint_names, keypoint_skeleton = self.keypoint_parser.keypoint_config()
        for row in row_iterable:
            bbox = self.bbox_parser.get_bbox(row)
            keypoints = self.keypoint_parser.get_keypoints(row)
            filename = self._get_scalar('filename', row)
            width = self._get_scalar('width', row)
            height = self._get_scalar('height', row)
            label = self._get_scalar('label', row)
            images.add(filename, width, height)
            categories.add(label, keypoint_names, keypoint_skeleton)
            annotations.append(
                coco.COCOAnnotation(
                    images.image_map[filename],
                    len(annotations),
                    categories.category_map[label],
                    bbox=bbox,
                    keypoints=keypoints
                )
            )
        return images.images, annotations, categories.categories

parse(row_iterable)

parse an iterable of rows (e.g. from a csv file) containing image annotation information into COCO format.

Parameters:

Name Type Description Default
row_iterable Iterable[Sequence]

an iterable of rows containing annotation information

required

Returns:

Name Type Description
images list[COCOImage]

a list of all unique images listed in the iterable, in COCO format

annotations list[COCOAnnotation]

a list of all annotations listed in the iterable, correctly indexed against the images and categories lists

categories list[COCOCategory]

a list of all unique categories listed in the iterable, in COCO format
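
As a rough illustration of the indexing described above, the sketch below mimics the parsing loop in plain Python. It does not use pycocowriter: the `parse_rows` helper, the dict-based records, and the 0-based ids are assumptions made for illustration, not the library's actual data structures.

```python
def parse_rows(rows, config):
    """Hypothetical stand-in for Iterable2COCO.parse, using plain dicts."""
    image_ids = {}      # filename -> image id (0-based here by assumption)
    category_ids = {}   # label -> category id (0-based here by assumption)
    annotations = []
    for row in rows:
        filename = row[config["filename"]]
        label = row[config["label"]]
        # setdefault registers each unique image/category only once
        image_id = image_ids.setdefault(filename, len(image_ids))
        category_id = category_ids.setdefault(label, len(category_ids))
        annotations.append(
            {"image_id": image_id, "id": len(annotations), "category_id": category_id}
        )
    return image_ids, annotations, category_ids


config = {"filename": 0, "label": 1}
rows = [["a.jpg", "cat"], ["a.jpg", "dog"], ["b.jpg", "cat"]]
image_ids, annotations, category_ids = parse_rows(rows, config)
print(annotations[2])
```

Three rows yield three annotations but only two images and two categories, and each annotation refers to its image and category by id rather than by name.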

Source code in pycocowriter/csv2coco.py
def parse(self, row_iterable: Iterable[Sequence]) -> tuple[
    list[coco.COCOImage], list[coco.COCOAnnotation], list[coco.COCOCategory]
]:
    '''
    parse an iterable of rows (e.g. from a csv file) containing image annotation information into
    COCO format.

    Parameters
    ----------
    row_iterable: Iterable[Sequence]
        an iterable of rows containing annotation information

    Returns
    -------
    images: list[COCOImage]
        a list of all unique images listed in the iterable, in COCO format
    annotations: list[COCOAnnotation]
        a list of all annotations listed in the iterable, correctly indexed
        against the images and categories lists
    categories: list[COCOCategory]
        a list of all unique categories listed in the iterable, in COCO format        
    '''
    categories = coco.COCOCategories()
    images = coco.COCOImages()
    annotations = []
    if 'meta' in self.config and 'skiprows' in self.config.meta:
        utils.skiprows(row_iterable, self.config.meta.skiprows)
    keypoint_names, keypoint_skeleton = self.keypoint_parser.keypoint_config()
    for row in row_iterable:
        bbox = self.bbox_parser.get_bbox(row)
        keypoints = self.keypoint_parser.get_keypoints(row)
        filename = self._get_scalar('filename', row)
        width = self._get_scalar('width', row)
        height = self._get_scalar('height', row)
        label = self._get_scalar('label', row)
        images.add(filename, width, height)
        categories.add(label, keypoint_names, keypoint_skeleton)
        annotations.append(
            coco.COCOAnnotation(
                images.image_map[filename],
                len(annotations),
                categories.category_map[label],
                bbox=bbox,
                keypoints=keypoints
            )
        )
    return images.images, annotations, categories.categories

Iterable2COCOConfig

Bases: AttrDict

This class validates a configuration to convert a "flat" iterable type into COCO. Because COCO has complex nested and optional types, it is not possible to have a "one-size-fits-all" flat iterable to COCO conversion. This configuration tells the converter which fields are present, and in which columns they are located. This class exists only to validate a dict as a valid configuration.

Parameters:

Name Type Description Default
config dict

a configuration dictionary adhering to SCHEMA. It is a plain dict so that it can be read directly from a .json file.

required

Attributes:

Name Type Description
SCHEMA dict

a jsonschema representing a valid configuration
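
A configuration might look something like the sketch below. The key names (`meta.skiprows`, `filename`, `label`, `bbox_tlbr` with `tlx`/`tly`/`brx`/`bry`) are the ones the parsers on this page look up. The column indices are made up, and treating `skiprows` as a count of leading rows is an assumption. Since SCHEMA is currently the empty schema `{}`, which accepts any input, none of this is enforced at validation time.

```python
import json

# Hypothetical configuration: each value is a column index into a row.
config = {
    "meta": {"skiprows": 1},   # assumed to mean "skip one header row"
    "filename": 0,
    "label": 1,
    "bbox_tlbr": {"tlx": 2, "tly": 3, "brx": 4, "bry": 5},
}

# The configuration is plain JSON-compatible data, so it survives a round trip.
text = json.dumps(config, indent=2)
restored = json.loads(text)
print(restored == config)
```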

Source code in pycocowriter/csv2coco.py
class Iterable2COCOConfig(utils.AttrDict):
    '''
    This class validates a configuration to convert a "flat" iterable type into COCO.
    Because COCO has complex nested and optional types, it is not possible to have a 
    "one-size-fits-all" flat iterable to COCO conversion.  This configuration tells
    the converter which fields are present, and in which columns they are located.
    This class exists only to validate a dict as a valid configuration.

    Parameters
    ----------
    config: dict
        a configuration dictionary adhering to SCHEMA.  Set up this way to be read from .json

    Attributes
    ----------
    SCHEMA: dict
        a jsonschema representing a valid configuration
    '''

    # TODO: This is the 'anything' schema.  Update to reflect the actual rules
    SCHEMA = {}

    def __init__(self, config: dict):
        self._validate_config(config)
        super().__init__(config)

    def _validate_config(self, config: dict):
        jsonschema.validate(config, Iterable2COCOConfig.SCHEMA)

IterableBBoxParser

Bases: object

This class helps parse bounding boxes from "row" data. These data can arrive in different formats ("top left, bottom right" or "top left, width/height"), and this class is intended to handle those differences.

Parameters:

Name Type Description Default
config Iterable2COCOConfig

this configuration should define how (and if) bounding boxes are present in each row

required
Source code in pycocowriter/csv2coco.py
class IterableBBoxParser(object):
    '''
    This class is to help parse bounding boxes from "row" data.  Sometimes these data
    are in different formats, so this class is intended to assist in dealing with these nuances

    Parameters
    ----------
    config: Iterable2COCOConfig
        this configuration should define how (and if) bounding boxes are present in each row
    '''

    def __init__(self, config: Iterable2COCOConfig):
        self.config = config
        self._init_bbox_method()

    def _init_bbox_tlbr(self):
        '''
        configures the `get_bbox` method to get bounding boxes in "top left, width/height" 
        format, given bounding boxes in "top left, bottom right" format
        '''
        self.get_bbox = self._get_bbox_tlbr
        self.bbox_cols = [
            self.config.bbox_tlbr.tlx,
            self.config.bbox_tlbr.tly,
            self.config.bbox_tlbr.brx,
            self.config.bbox_tlbr.bry,
        ]

    def _init_bbox_xywh(self):
        '''
        configures the `get_bbox` method to get bounding boxes in "top left, width/height" 
        format, given bounding boxes in "top left, width/height" format
        '''
        self.get_bbox = self._get_bbox_xywh
        self.bbox_cols = [
            self.config.bbox_xywh.x,
            self.config.bbox_xywh.y,
            self.config.bbox_xywh.w,
            self.config.bbox_xywh.h,
        ]

    def _init_bbox_method(self):
        '''
        dispatches configuration of the `get_bbox` method depending on the 
        contents of the configuration file.
        '''
        if 'bbox_tlbr' in self.config:
            self._init_bbox_tlbr()
        elif 'bbox_xywh' in self.config:
            self._init_bbox_xywh()

    def get_bbox(self, row: Sequence) -> list[int, int, int, int] | None:
        '''
        this method gets overwritten in __init__ if the config has a bbox option

        Parameters
        ----------
        row: Sequence
            a row, e.g. from a csv.  The bounding box should be in some columns of this row
            as defined in the configuration

        Returns
        -------
        bbox: list[int,int,int,int]
            the bounding box as [top_left_x, top_left_y, width, height]
        '''
        return None

    def _get_bbox_tlbr(self, row: Sequence) -> list[int, int, int, int]:
        '''
        gets a bounding box in "top left, width/height" format given bounding box subsetted from a  
        row from, e.g. a csv.  The subset of columns in the input row should be defined in the config, 
        and should contain a bounding box in "top left, bottom right" format.

        Parameters
        ----------
        row: Sequence
            a row, e.g. from a csv.  The bounding box should be in some columns of this row
            as defined in the configuration

        Returns
        -------
        bbox: list[int,int,int,int]
            the bounding box as [top_left_x, top_left_y, width, height]
        '''
        return bbox_tlbr2xywh([int(float(row[i])) for i in self.bbox_cols])

    def _get_bbox_xywh(self, row: Sequence) -> list[int, int, int, int]:
        '''
        gets a bounding box in "top left, width/height" format given bounding box subsetted from a  
        row from, e.g. a csv.  The subset of columns in the input row should be defined in the config, 
        and should contain a bounding box in "top left, width/height" format.

        Parameters
        ----------
        row: Sequence
            a row, e.g. from a csv.  The bounding box should be in some columns of this row
            as defined in the configuration

        Returns
        -------
        bbox: list[int,int,int,int]
            the bounding box as [top_left_x, top_left_y, width, height]
        '''
        return [int(float(row[i])) for i in self.bbox_cols]

get_bbox(row)

This method gets overwritten in `__init__` if the config has a bbox option. Otherwise it returns None.

Parameters:

Name Type Description Default
row Sequence

a row, e.g. from a csv. The bounding box should be in some columns of this row as defined in the configuration

required

Returns:

Name Type Description
bbox list[int, int, int, int]

the bounding box as [top_left_x, top_left_y, width, height]
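
Both concrete bbox getters coerce each cell with `int(float(...))`. The sketch below shows what that does to typical csv cells. The `to_int` helper, the row, and the column indices are hypothetical.

```python
def to_int(cell):
    """Coerce a csv cell such as '12.7' to an int, as the parsers do."""
    return int(float(cell))


row = ["img.jpg", "10.0", "20.9", "30", "40.5"]
bbox_cols = [1, 2, 3, 4]  # hypothetical column indices for x, y, w, h
bbox = [to_int(row[i]) for i in bbox_cols]
print(bbox)
```

Note that `int()` truncates toward zero rather than rounding, so fractional coordinates lose their fractional part.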

Source code in pycocowriter/csv2coco.py
def get_bbox(self, row: Sequence) -> list[int, int, int, int] | None:
    '''
    this method gets overwritten in __init__ if the config has a bbox option

    Parameters
    ----------
    row: Sequence
        a row, e.g. from a csv.  The bounding box should be in some columns of this row
        as defined in the configuration

    Returns
    -------
    bbox: list[int,int,int,int]
        the bounding box as [top_left_x, top_left_y, width, height]
    '''
    return None

IterableKeypointParser

Bases: object

This class helps parse keypoints from "row" data. Sometimes these data are in different formats, so this class is intended to assist in dealing with these nuances.

Parameters:

Name Type Description Default
config Iterable2COCOConfig

this configuration should define how (and if) keypoints are present in each row

required
Source code in pycocowriter/csv2coco.py
class IterableKeypointParser(object):
    '''
    This class is to help parse keypoints from "row" data.  Sometimes these data
    are in different formats, so this class is intended to assist in dealing with these nuances

    Parameters
    ----------
    config: Iterable2COCOConfig
        this configuration should define how (and if) keypoints are present in each row
    '''
    FULLY_VISIBLE_COCO_KEYPOINT = 2

    def __init__(self, config: Iterable2COCOConfig):
        self.config = config

    def keypoint_config(self) -> tuple[list[str], list[list[int]]]:
        '''
        get the keypoint layout from the configuration file.  the coco keypoint layout is
        ['kpname1', 'kpname2', ...]
        and also a skeleton
        [edge1, edge2, ...]
        where edges are two-tuples of 1-indexed indexes of keypoints.  For example, if keypoints are:
        ['hip', 'knee', 'ankle'],
        the skeleton would be:
        [[1,2],[2,3]] because the hip has an edge with the knee, and the knee has an edge to the ankle.

        both of these items should be defined in the configuration

        TODO: we only support ONE keypoint structure per configuration right now....
        if you have multiple possible keypoint structures
        e.g. hands and also human poses, then we need to rework this to be more general

        Returns
        -------
        keypoints: list[str]
            the list of keypoint names
        skeleton: list[list[int]]
            the skeleton corresponding to the keypoint names
        '''
        if 'keypoints' not in self.config:
            return None, None
        return (
            [
                keypoint.name for keypoint in self.config.keypoints
            ],
            self.config.keypoint_skeleton
        )

    def get_keypoints(self, row: Sequence) -> list[int]:
        '''
        get keypoints from a "flat" row using expected indices in the row of keypoints 
        defined in self.config

        Parameters
        ----------
        row: Sequence
            a row, e.g. from a csv.  The keypoints should be in some columns of this row
            as defined in the configuration

        Returns
        -------
        keypoints: list[int]
            keypoint locations in form [x1,y1,v1,x2,y2,v2,....] where x,y are the location and v is the
            "visibility" according to the COCO docs
        '''
        if 'keypoints' not in self.config:
            return None
        return sum(
            [
                [
                    int(float(row[keypoint.x])),
                    int(float(row[int(keypoint.y)])),
                    IterableKeypointParser.FULLY_VISIBLE_COCO_KEYPOINT if 'visibility' not in keypoint else int(float(row[keypoint.visibility]))
                ]
                for keypoint in self.config.keypoints
            ],
            []
        )

get_keypoints(row)

get keypoints from a "flat" row using expected indices in the row of keypoints defined in self.config

Parameters:

Name Type Description Default
row Sequence

a row, e.g. from a csv. The keypoints should be in some columns of this row as defined in the configuration

required

Returns:

Name Type Description
keypoints list[int]

keypoint locations in form [x1,y1,v1,x2,y2,v2,....] where x,y are the location and v is the "visibility" according to the COCO docs
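
The flattening can be sketched without the library as follows. The `flatten_keypoints` helper and the use of plain dicts in place of the configuration object are assumptions for illustration. The default of 2 mirrors `FULLY_VISIBLE_COCO_KEYPOINT`, which in COCO means "labeled and visible".

```python
FULLY_VISIBLE = 2  # COCO visibility flag: labeled and visible


def flatten_keypoints(row, keypoints):
    """Hypothetical helper: build [x1, y1, v1, x2, y2, v2, ...] from a row."""
    flat = []
    for kp in keypoints:
        x = int(float(row[kp["x"]]))
        y = int(float(row[kp["y"]]))
        # fall back to "fully visible" when no visibility column is configured
        v = int(float(row[kp["visibility"]])) if "visibility" in kp else FULLY_VISIBLE
        flat.extend([x, y, v])
    return flat


keypoints = [
    {"name": "hip", "x": 0, "y": 1},
    {"name": "knee", "x": 2, "y": 3, "visibility": 4},
]
row = ["10", "20", "30.5", "40", "1"]
flat = flatten_keypoints(row, keypoints)
print(flat)
```

The first keypoint has no visibility column and so gets the default, while the second reads its visibility from column 4.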

Source code in pycocowriter/csv2coco.py
def get_keypoints(self, row: Sequence) -> list[int]:
    '''
    get keypoints from a "flat" row using expected indices in the row of keypoints 
    defined in self.config

    Parameters
    ----------
    row: Sequence
        a row, e.g. from a csv.  The keypoints should be in some columns of this row
        as defined in the configuration

    Returns
    -------
    keypoints: list[int]
        keypoint locations in form [x1,y1,v1,x2,y2,v2,....] where x,y are the location and v is the
        "visibility" according to the COCO docs
    '''
    if 'keypoints' not in self.config:
        return None
    return sum(
        [
            [
                int(float(row[keypoint.x])),
                int(float(row[int(keypoint.y)])),
                IterableKeypointParser.FULLY_VISIBLE_COCO_KEYPOINT if 'visibility' not in keypoint else int(float(row[keypoint.visibility]))
            ]
            for keypoint in self.config.keypoints
        ],
        []
    )

keypoint_config()

Get the keypoint layout from the configuration. The COCO keypoint layout consists of a list of keypoint names, ['kpname1', 'kpname2', ...], and a skeleton, [edge1, edge2, ...], where each edge is a pair of 1-indexed keypoint indices. For example, if the keypoints are ['hip', 'knee', 'ankle'], the skeleton would be [[1,2],[2,3]], because the hip has an edge to the knee and the knee has an edge to the ankle.

Both of these items should be defined in the configuration. If the configuration has no keypoints, (None, None) is returned.

TODO: only ONE keypoint structure per configuration is supported right now. If you have multiple possible keypoint structures (e.g. hands and also human poses), this needs to be reworked to be more general.

Returns:

Name Type Description
keypoints list[str]

the list of keypoint names

skeleton list[list[int]]

the skeleton corresponding to the keypoint names
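
The 1-indexing of skeleton edges is easy to get wrong, so here is a small sketch that resolves the example edges back to keypoint names. It uses only the example values from the description above; the `edges` variable is illustrative.

```python
names = ["hip", "knee", "ankle"]
skeleton = [[1, 2], [2, 3]]  # 1-indexed pairs of keypoint positions

# Subtract 1 to turn each skeleton index into a Python list index.
edges = [(names[a - 1], names[b - 1]) for a, b in skeleton]
print(edges)
```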

Source code in pycocowriter/csv2coco.py
def keypoint_config(self) -> tuple[list[str], list[list[int]]]:
    '''
    get the keypoint layout from the configuration file.  the coco keypoint layout is
    ['kpname1', 'kpname2', ...]
    and also a skeleton
    [edge1, edge2, ...]
    where edges are two-tuples of 1-indexed indexes of keypoints.  For example, if keypoints are:
    ['hip', 'knee', 'ankle'],
    the skeleton would be:
    [[1,2],[2,3]] because the hip has an edge with the knee, and the knee has an edge to the ankle.

    both of these items should be defined in the configuration

    TODO: we only support ONE keypoint structure per configuration right now....
    if you have multiple possible keypoint structures
    e.g. hands and also human poses, then we need to rework this to be more general

    Returns
    -------
    keypoints: list[str]
        the list of keypoint names
    skeleton: list[list[int]]
        the skeleton corresponding to the keypoint names
    '''
    if 'keypoints' not in self.config:
        return None, None
    return (
        [
            keypoint.name for keypoint in self.config.keypoints
        ],
        self.config.keypoint_skeleton
    )

bbox_tlbr2xywh(bbox)

Convert a bounding box in "top left, bottom right" format to a bounding box in "top left, width/height" format.

Parameters:

Name Type Description Default
bbox tuple[int, int, int, int]

a four-tuple of [top_left_x, top_left_y, bottom_right_x, bottom_right_y]

required

Returns:

Name Type Description
bbox tuple[int, int, int, int]

a four-tuple of [top_left_x, top_left_y, width, height]
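
A worked example of the conversion, written as a standalone sketch. `tlbr_to_xywh` repeats the same arithmetic as the function documented here, and `xywh_to_tlbr` is a hypothetical inverse added only to check the round trip.

```python
def tlbr_to_xywh(bbox):
    """Same arithmetic as bbox_tlbr2xywh: width = brx - tlx, height = bry - tly."""
    tlx, tly, brx, bry = bbox
    return (tlx, tly, brx - tlx, bry - tly)


def xywh_to_tlbr(bbox):
    """Hypothetical inverse, used here only to verify the round trip."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


box = (10, 20, 110, 70)
converted = tlbr_to_xywh(box)
print(converted)
```

The top-left corner is unchanged, and only the last two entries are replaced by the width and height.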

Source code in pycocowriter/csv2coco.py
def bbox_tlbr2xywh(bbox: tuple[int, int, int, int]) -> tuple[int, int, int, int]:
    '''
    Convert a bounding box in "top left, bottom right" format 
    to a bounding box in "top left, width height" format

    Parameters
    ----------
    bbox: tuple[int,int,int,int]
        a four-tuple of [top_left_x, top_left_y, bottom_right_x, bottom_right_y]

    Returns
    -------
    bbox: tuple[int,int,int,int]    
        a four-tuple of [top_left_x, top_left_y, width, height]
    '''
    return (bbox[0], bbox[1], bbox[2] - bbox[0], bbox[3] - bbox[1])

parse_csv(config, filename)

Helper function that opens a csv file and passes it row by row into the COCO builder.

Parameters:

Name Type Description Default
config dict

a dictionary conforming to the Iterable2COCOConfig.SCHEMA

required
filename str

a csv file to be read and converted to COCO

required

Returns:

Name Type Description
images list[COCOImage]

a list of COCOImage reflecting the images from the csv file

annotations list[COCOAnnotation]

a list of COCOAnnotation reflecting the annotations from the csv file

categories list[COCOCategory]

a list of COCOCategory reflecting the categories of the annotations
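
To see what the parser receives, the sketch below reads a small in-memory csv with the standard `csv` module. The column layout and file contents are made up for illustration, and the header row is skipped by hand here rather than through the configuration.

```python
import csv
import io

# Hypothetical csv contents with one header row and one annotation row.
text = "filename,label,tlx,tly,brx,bry\nimg.jpg,cat,1,2,11,22\n"

reader = csv.reader(io.StringIO(text))
next(reader)         # skip the header row
rows = list(reader)  # the reader is an iterator and can be consumed only once
print(rows)
```

Every cell arrives as a string, which is why the bounding box and keypoint parsers convert values with `int(float(...))`.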

Source code in pycocowriter/csv2coco.py
def parse_csv(config: dict, filename: str) -> tuple[
    list[coco.COCOImage], list[coco.COCOAnnotation], list[coco.COCOCategory]]:
    """Helper method to open a csv file and pass it row-by-row into the COCO builder

    Parameters
    ----------
    config : dict
        a dictionary conforming to the Iterable2COCOConfig.SCHEMA
    filename : str
        a csv file to be read and converted to COCO

    Returns
    -------
    images : list[COCOImage]
        a list of COCOImage reflecting the images from the csv file
    annotations : list[COCOAnnotation]
        a list of COCOAnnotation reflecting the annotations from the csv file
    categories : list[COCOCategory]
        a list of COCOCategory reflecting the categories of the annotations
    """
    csv2coco = Iterable2COCO(Iterable2COCOConfig(config))
    with open(filename) as f:
        reader = csv.reader(f)
        images, annotations, categories = csv2coco.parse(reader)
    return images, annotations, categories