This post starts a series of articles that give insight into the new
features of Jansson 2.0. First up is the json_unpack() API. I think
it's the most powerful new feature, allowing the user to do two
things with a JSON value: data extraction, and validation against a
simple schema. The idea has been stolen from Python's C API.
Example:
/* Assume that obj is the following JSON object:
 * {"x": 15.4, "y": 99.8, "z": 42.0}
 */
json_t *obj;
double x, y, z;

if(json_unpack(obj, "{s:f, s:f, s:f}", "x", &x, "y", &y, "z", &z))
    return -1; /* error */

assert(x == 15.4 && y == 99.8 && z == 42);
The format string passed to json_unpack() describes the structure of
the object. The s format denotes an object key, and the f format
means a real number value. Whitespace, : and , are ignored, so
{sfsfsf} would be an equivalent format string to the one above.
After the format string, there's one argument for each format
character. For object keys, a string specifies which key is
accessed, and for real numbers, a pointer to double gives the
address where the value is stored.
The equivalent code without json_unpack() would be something like
this:
json_t *obj, *tmp;
double x, y, z;

tmp = json_object_get(obj, "x");
if(!json_is_real(tmp))
    return -1; /* error */
x = json_real_value(tmp);

/* repeat for y and z */
/* ... */

printf("x: %f, y: %f, z: %f\n", x, y, z);
/* ==> x: 15.400000, y: 99.800000, z: 42.000000 */
The code that uses json_unpack() is much shorter and cleaner, and
it's easier to see what it's doing.
Nested values are supported, too:
/* Assume that nested is the following JSON object:
 * {"foo": {"bar": [11, 12, 13]}}
 */
json_t *nested;
int i1, i2, i3;

if(json_unpack(nested, "{s:{s:[iii]}}", "foo", "bar", &i1, &i2, &i3))
    return -1; /* error */

assert(i1 == 11 && i2 == 12 && i3 == 13);
This time, the format string has two nested objects and a nested
array. There's no limit on the nesting levels. The variable
arguments are used in the "flat" order in which they appear in the
format string.
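To make the "flat" argument order concrete, here is a toy Python
analogue of the format-string idea. It is not Jansson's
implementation and covers only the {, [, s, i and f characters used
in this post; the function name unpack and its return-a-list
interface are made up for illustration. Object keys are consumed
left to right, and extracted values come back in the same order:

```python
def unpack(value, fmt, *keys):
    """Toy analogue of a json_unpack()-style format string.

    Returns the extracted values in flat, left-to-right order and
    raises ValueError if the value does not match the format.
    """
    # Whitespace, ':' and ',' carry no meaning, so drop them first.
    tokens = [c for c in fmt if c not in " \t\n:,"]
    keys = list(keys)   # object keys, consumed in flat order
    out = []            # extracted values, in flat order
    pos = 0

    def parse(node):
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("format string ended early")
        tok = tokens[pos]
        pos += 1
        if tok == "{":
            if not isinstance(node, dict):
                raise ValueError("expected an object")
            while pos < len(tokens) and tokens[pos] != "}":
                if tokens[pos] != "s":
                    raise ValueError("expected 's' for an object key")
                pos += 1
                if not keys:
                    raise ValueError("missing key argument")
                key = keys.pop(0)
                if key not in node:
                    raise ValueError("object has no key %r" % key)
                parse(node[key])
            if pos >= len(tokens):
                raise ValueError("unterminated object format")
            pos += 1  # skip '}'
        elif tok == "[":
            if not isinstance(node, list):
                raise ValueError("expected an array")
            index = 0
            while pos < len(tokens) and tokens[pos] != "]":
                if index >= len(node):
                    raise ValueError("array is too short")
                parse(node[index])
                index += 1
            if pos >= len(tokens):
                raise ValueError("unterminated array format")
            pos += 1  # skip ']'
        elif tok == "i":
            # bool is a subclass of int in Python, so exclude it.
            if isinstance(node, bool) or not isinstance(node, int):
                raise ValueError("expected an integer")
            out.append(node)
        elif tok == "f":
            if not isinstance(node, float):
                raise ValueError("expected a real number")
            out.append(node)
        else:
            raise ValueError("unknown format character %r" % tok)

    parse(value)
    return out


nested = {"foo": {"bar": [11, 12, 13]}}
print(unpack(nested, "{s:{s:[iii]}}", "foo", "bar"))  # [11, 12, 13]
```

The two keys "foo" and "bar" are used in the order their s
characters appear, regardless of how deeply they are nested, which is
the same rule the C API applies to its variable arguments.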
The same API can also be used in a validation-only mode, i.e.
without extracting any values. Error messages are also available:
/* Assume the same JSON object as in the previous example */
json_t *nested;
json_error_t error;

if(json_unpack_ex(nested, &error, JSON_VALIDATE_ONLY,
                  "{s:{s:[iii]}}", "foo", "bar"))
{
    fprintf(stderr, "Error: %d:%d: %s\n", error.line, error.column, error.text);
    return -1;
}
The json_unpack_ex() function is the extended version of
json_unpack(). It takes an error parameter, similar to the decoding
functions, and optional flags to control the behaviour. The
JSON_VALIDATE_ONLY flag tells it to only validate and not to extract
anything. Extra arguments after the format string are only required
for object keys. The available validation is quite simple: only the
object/array structure and value types can be checked, but this
usually saves a lot of code.
I strongly believe that this feature, along with the json_pack() API
described in the next part, will make it an order of magnitude more
pleasant to manipulate JSON data in C. Many thanks to Graeme Smecher
for suggesting this and providing the initial implementation.
This article only gave a few examples. For full details, all
available format characters and flags, see the documentation.
In a work project, I have a few JavaScript files that are generated
from a bunch of other files. The project is a Django website, so I
just have views that generate the files on the fly when running in
debug mode, and everything works nice and smooth.
For production, though, I needed flat files that could be served
from disk. I figured that the best approach would be to generate the
files in setup.py upon installation, but I could only find very
superficial documentation on how to do that.
A brief intro to setup.py: In every project's setup.py file, the
setup function, from the Python standard library's distutils.core
module, is used to define the project's files and metadata. (setup
can also be imported from setuptools or distribute, but they're
compatible with distutils.) With the standard commands that setup.py
provides, the files and metadata can be compiled to an egg,
distributed as a source tarball, uploaded to PyPI, and so on.
The entry point to altering setup.py's behaviour is the optional
cmdclass argument to the setup function. Its value is a dict from
command names to distutils.cmd.Command subclasses that implement the
commands. The build_py command is where the package data files are
installed, so to override build_py, I created the class my_build_py
and registered it, like this:
from distutils.core import setup
from distutils.command.build_py import build_py

class my_build_py(build_py):
    # ...

setup(
    # Define metadata, files, etc.
    # ...
    cmdclass={'build_py': my_build_py}
)
The run method of build_py, along with copying and compiling the
Python source files, is responsible for copying the package's data
files to the build directory build/lib.<platform>. (The actual
directory name is stored in the build_py instance's self.build_lib
variable.)
To install your own files, just override the run method. Remember to
call the superclass's run after you're done with your own files.
import os

def generate_content():
    # generate the file content...
    return content

class my_build_py(build_py):
    def run(self):
        # honor the --dry-run flag
        if not self.dry_run:
            target_dir = os.path.join(self.build_lib, 'mypkg/media')

            # mkpath is a distutils helper to create directories
            self.mkpath(target_dir)

            with open(os.path.join(target_dir, 'myfile.js'), 'w') as fobj:
                fobj.write(generate_content())

        # distutils uses old-style classes, so no super()
        build_py.run(self)
And that's it! A later phase of the installation copies everything
from build/lib.<platform> to the correct place, so your generated
file gets in, too.
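What generate_content() actually does depends on the project. As a
purely hypothetical stand-in, the sketch below simply concatenates
every .js file found in a source directory; the source_dir parameter
and the concatenation strategy are assumptions for illustration, not
what my project does:

```python
import os


def generate_content(source_dir):
    """Concatenate all .js files in source_dir, in sorted name order.

    Hypothetical stand-in: a real generation step may do something
    entirely different (templating, minification, ...).
    """
    parts = []
    for name in sorted(os.listdir(source_dir)):
        if not name.endswith('.js'):
            continue
        with open(os.path.join(source_dir, name)) as fobj:
            # Label each piece so its origin is visible in the output.
            parts.append('/* %s */\n%s' % (name, fobj.read()))
    return '\n'.join(parts)
```

Sorting the file names keeps the output stable from one build to the
next, which helps if the generated file is cached or compared.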
For some time now, I've been unsatisfied with the state of user
names and passwords for the numerous services I use on the web. I
have a hard time remembering all the services I have signed up for,
and a bad habit of using a few common passwords for all of them.
So, I hacked for a while, and yesterday, I pushed the first release
of sala to PyPI. It's a simple, filesystem-based, encrypted password
storage system that uses GnuPG's symmetric encryption. As usual, a
git repository is available at GitHub.
The main idea of sala is to store passwords (or other tiny,
plain-text secrets) in encrypted plain-text files in a directory
hierarchy, like this:
/path/to/passwords
|-- example-service.com
|   |-- +webmail
|   |   |-- @myuser
|   |   `-- @otheruser
|   `-- +adminpanel
|       `-- @admin
`-- my-linux-box
    |-- @myuser
    `-- @root
As sala is a command line utility and there's one file per password,
tab completion and other shell goodies are available. The custom of
prefixing user names with @ and category/group/subservice names with
+ is my own preference and not enforced by the program. You may come
up with your own scheme, for example if you want to protect user
names as well as the actual passwords. For more information, see the
PyPI page.
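Because the store is just a directory tree, ordinary tools can work
with it. As a small sketch, and not part of sala itself, the
following function lists the entries of such a tree by relying on
the @ prefix convention described above; the function name
list_entries is made up for this example, and it only looks at file
names, never at the encrypted contents:

```python
import os


def list_entries(root):
    """Yield (groups, user, path) for each @-prefixed file under root.

    groups is the list of directory names between root and the file.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        rel = os.path.relpath(dirpath, root)
        groups = [] if rel == os.curdir else rel.split(os.sep)
        for name in sorted(filenames):
            if name.startswith('@'):
                yield groups, name[1:], os.path.join(dirpath, name)
```

For the tree shown above, iterating over
list_entries('/path/to/passwords') would yield one tuple per user
file, for example (['my-linux-box'], 'root', ...) for the last one.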
Use pip install sala to install, or download the source, unpack, and
invoke python setup.py install. In addition to Python 2.5 or newer,
sala requires gpg and GnuPGInterface. Currently, Python 3 is not
supported because GnuPGInterface doesn't support it.
Update: While writing this blog entry, I found a few
packaging bugs, and released version 1.0.1.