Python reads JSON files with comments

When reading a JSON file that contains comments, the json module sometimes fails with an error like this:

Expecting property name: line 12 column 3 (char 268)

This means that the characters at line 12, column 3 of the file do not conform to the JSON format. In this case they are a comment.
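The position in the message can also be read programmatically. Below is a minimal sketch, assuming Python 3.5 or later, where the error is a json.JSONDecodeError with lineno, colno and pos attributes. The helper name locate_json_error and the sample text are made up for illustration.

```python
import json


def locate_json_error(text):
    """Return (line, column, char offset) of the first parse error,
    or None if the text is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # JSONDecodeError is a subclass of ValueError that carries the position
        return (exc.lineno, exc.colno, exc.pos)
    return None


# Hypothetical sample: a // comment on the second line
sample = '{\n  // a comment\n  "a": 1\n}'
print(locate_json_error(sample))
```

The reported position points at the first / of the comment, which is where the parser expected a property name.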

To parse the JSON file, the comments must be removed first. In Node.js there is a third-party package, strip-json-comments, made specifically for removing comments. Unfortunately, I did not find an equivalent package for Python.

I found this code on the Internet:

import json
import re

# Regular expression for comments: // to end of line, or /* ... */
comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)

def parse_json(filename):
    """ Parse a JSON file
        First remove comments and then use the json module package
        Comments look like :
            // ...
        or
            /*
            ...
            */
    """
    with open(filename) as f:
        content = f.read()

    # Look for comments and cut them out one at a time
    match = comment_re.search(content)
    while match:
        content = content[:match.start()] + content[match.end():]
        match = comment_re.search(content)

    # Debug output: the text with comments removed
    print(content)
    # Return the parsed JSON data
    return json.loads(content)

It removes comments of the form

// ....
/*
....
*/

Removing the first kind of comment (//) can go wrong, though. For example, suppose the JSON file contains a key-value pair such as

"url": "http://127.0.0.1:16666",

Everything after http:, that is, //127.0.0.1:16666", is treated as a comment and removed. It seems that you have to write your own.
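The failure is easy to reproduce in isolation. The sketch below reuses the comment pattern from the regex-based parse_json and applies it to the single line from the example. The helper name strip_with_regex is made up for illustration, and re.sub is used in place of the search loop.

```python
import re

# Same pattern as in the regex-based parse_json
comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)


def strip_with_regex(text):
    """Remove everything the comment pattern matches."""
    return comment_re.sub('', text)


line = '"url": "http://127.0.0.1:16666",'
# The pattern has no notion of string literals, so it cuts at the // in the URL
print(strip_with_regex(line))
```

What is left of the line is no longer valid JSON, because the string value is cut off in the middle and never closed.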

# Read a JSON file with // and /* */ comments
import json
import re

# Matches an IPv4 address such as 127.0.0.1 at the start of a string
ip_re = re.compile(r'[0-9]+(?:\.[0-9]+){3}')

def parse_json(filename):
    """ Parse a JSON file
        First remove comments and then use the json module package
        Comments look like :
            // ...
        or
            /*
            ...
            */
    """
    with open(filename) as f:
        all_lines = f.readlines()
    # Remove comments like // but keep the // in http://ip_addr
    res = [strip_comment(line) for line in all_lines]
    result = []
    comment = False
    # Remove comments like /* */. This works on whole lines, so any line
    # containing /* or */ is dropped entirely
    for l in res:
        if l.find("/*") != -1:
            comment = True
        if not comment:
            result.append(l)
        if l.find("*/") != -1:
            comment = False
    # If you directly use json.loads(str(res)), it will report "ValueError: No JSON object could be decoded"
    return json.loads("".join(result))

def strip_comment(line):
    index = line.find("//")
    if index == -1:
        return line
    # The slice offset is missing in the source; index + 2 (the text right
    # after "//") is assumed here
    line_str = line[index + 2:]
    ip_match = ip_re.match(line_str)
    if ip_match:
        # "//" is followed by an IP address: keep it and go on scanning
        # the rest of the line after the address
        keep = index + 2 + ip_match.end()
        return line[:keep] + strip_comment(line[keep:])
    else:
        # "//" starts a comment: drop the rest of the line, keep the line ending
        return line[:index] + ("\n" if line.endswith("\n") else "")

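This code solves the problem described above: the // inside an address such as http://127.0.0.1:16666 is no longer treated as the start of a comment.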
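The IP address check only covers URLs that contain an IP address. A more general alternative, which is a different technique from the code in this post, is to scan the text character by character and skip comments only when the scanner is outside a string literal. The following is a minimal sketch of that idea. The names strip_json_comments and loads_with_comments are made up for illustration, and the sketch assumes the input is otherwise valid JSON.

```python
import json


def strip_json_comments(text):
    """Remove // and /* */ comments that occur outside JSON string literals."""
    out = []
    i = 0
    n = len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character too, so \" does not end the string
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */ (or to the end if unterminated)
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def loads_with_comments(text):
    """Strip comments, then hand the result to the json module."""
    return json.loads(strip_json_comments(text))


# Hypothetical sample mixing both comment styles with // inside strings
sample = (
    '{\n'
    '  // server address\n'
    '  "url": "http://127.0.0.1:16666", /* the port\n'
    '  may change */\n'
    '  "name": "a // b"\n'
    '}'
)
print(loads_with_comments(sample))
```

Because the scanner tracks whether it is inside a string, any // or /* that appears in a value is left alone, whatever the URL looks like.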

Tags: JSON Python

Posted on Sun, 03 May 2020 05:06:52 -0700 by PRSBOY