C Arch patterns

There are some patterns you can see in C code.
Mostly signs that give off indications of what are the data structures used.

For example, this is a code from cpython source(Include/object.h):(obfuscated for illustrative purposes)

typedef struct XXX {
struct XXX *nnn;
const char* string;
PyObject *object;
} XXX;

The sign here is that the struct name (XXX here) is used, within the
struct definition for another element. In this case, this element is a pointer to the self. This indicates this is a linked list.

It can be a circular linked list or singly linked list with the last node pointing to NULL. That distinction can be guessed with initial values. Given it’s not being initialized, I would guess this is a circular linked list, but to verify I will have to check parts of the code where this struct is initialized.

On the contrary, if I had needed this to be a doubly linked list instead, I would have defined it this way instead.

typedef struct XXX {
struct XXX *ppp;
struct XXX *nnn;
const char* string;
} XXX;

The point being two pointers to the same structure, but this does not necessarily mean there’s a doubly linked list implementation. If someone’s trying to write Obfuscated C code, they might use this struct but use it differently, on purpose to hinder code readability.

Oh, by the way, this could also be made into a binary-tree structure by manipulating the insertions into *ppp, *nnn, as new structs. In case of making this a binary tree, we would need a check to make sure no new object linked to *nnn(and *ppp) is already in the structure. Infact it would be more apt naming those *left and *right than *ppp, *nnn.


typedef struct XXX {
struct XXX *left;
struct XXX *right;
const char* nodelabel;
} XXX;

Note purely in terms of the type of elements composing/constituting the struct, there’s no difference between this and the above doubly linked list. However, the difference will arise out of how these struct pointers are inserted.

Now, I’ll graduate to the next data structure, I believe is logically next, (and is one of my favourite) data structure — Graphs.

Linked lists, are simply a very specific case of a graph(which can be called a specific case of sets).
Linked lists are simply graphs restricted to only one edge,(directed) between any two nodes.
Ofcourse, am assuming planar graph<link>,(i.e: one edge can only connect two nodes) and not hyper-graph<link>(think nodes in 3D, and edges that can connect more than “2 nodes”.)

So, Opencog<link> is a open-source attempt to create an Artificial General Intelligence system.
At the core of it is a hypergraph<link to why hypergraphs>(aka atomspace).
Note: this is a C++ project.

Here’s the code for defining an atom:

class Atom
: public std::enable_shared_from_this<Atom>
friend class ::AtomUTest; // Needs to call setFlag()
friend class AtomTable; // Needs to call MarkedForRemoval()
friend class ImportanceIndex; // Needs to call setFlag()
friend class Handle; // Needs to view _uuid
friend class SavingLoading; // Needs to set _uuid
friend class TLB; // Needs to view _uuid

//! Sets the AtomTable in which this Atom is inserted.
void setAtomTable(AtomTable *);

//! Returns the AtomTable in which this Atom is inserted.
AtomTable *getAtomTable() const { return _atomTable; }

UUID _uuid;
AtomTable *_atomTable;

Type _type;
char _flags;

TruthValuePtr _truthValue;
AttentionValuePtr _attentionValue;

// Lock needed to serialize AV changes.
// Just right now, we will use a single shared mutex for all
// locking. If this causes too much contention, then we can
// fall back to a non-global lock, at the cost of 40 additional
// bytes per atom.
static std::mutex _avmtx;
static std::mutex _mtx;

* Constructor for this class.
* @param The type of the atom.
* @param Outgoing set of the atom, that is, the set of atoms this
* atom references. It must be an empty vector if the atom is a node.
* @param The truthValue of the atom. note: This is not cloned as
* in setTruthValue.
Atom(Type, TruthValuePtr = TruthValue::NULL_TV(),
AttentionValuePtr = AttentionValue::DEFAULT_AV());

struct InSet
// incoming set is not tracked by garbage collector,
// to avoid cyclic references.
// std::set<ptr> uses 48 bytes (per atom).
IncomingSet _iset;
// Some people want to know if the incoming set has changed...
AtomPairSignal _addAtomSignal;
AtomPairSignal _removeAtomSignal;
typedef std::shared_ptr<InSet> InSetPtr;
InSetPtr _incoming_set;
void keep_incoming_set();
void drop_incoming_set();

// Insert and remove links from the incoming set.
void insert_atom(LinkPtr);
void remove_atom(LinkPtr);

/** Returns whether this atom is marked for removal.
* @return Whether this atom is marked for removal.
bool isMarkedForRemoval() const;

/** Returns an atom flag.
* A byte represents all flags. Each bit is one of them.
* @param An int indicating which of the flags will be returned.
* @return A boolean indicating if that flag is set or not.
bool getFlag(int) const;

/** Changes the value of the given flag.
* @param An int indicating which of the flags will be set.
* @param A boolean indicating the new value of the flag.
void setFlag(int, bool);

//! Marks the atom for removal.
void markForRemoval();

//! Unsets removal flag.
void unsetRemovalFlag();


virtual ~Atom();

/** Returns the type of the atom.
* @return The type of the atom.
inline Type getType() const { return _type; }

/** Returns the handle of the atom.
* @return The handle of the atom.
inline Handle getHandle() {
return Handle(shared_from_this());

/** Returns the AttentionValue object of the atom.
* @return The const reference to the AttentionValue object
* of the atom.
AttentionValuePtr getAttentionValue() const { return _attentionValue; }

//! Sets the AttentionValue object of the atom.
void setAttentionValue(AttentionValuePtr) throw (RuntimeException);

/** Returns the TruthValue object of the atom.
* @return The const referent to the TruthValue object of the atom.
TruthValuePtr getTruthValue() const { return _truthValue; }

//! Sets the TruthValue object of the atom.
void setTruthValue(TruthValuePtr);
void setTruthValue(CompositeTruthValuePtr ctv) {

//! Return the incoming set of this atom.
IncomingSet getIncomingSet() const;

/** Returns a string representation of the node.
* @return A string representation of the node.
virtual std::string toString(std::string indent = "") const = 0;
virtual std::string toShortString(std::string indent = "") const = 0;

/** Returns whether two atoms are equal.
* @return true if the atoms are equal, false otherwise.
virtual bool operator==(const Atom&) const = 0;

/** Returns whether two atoms are different.
* @return true if the atoms are different, false otherwise.
virtual bool operator!=(const Atom&) const = 0;

Whoa, C++, reputation of being verbose and fearsome is justified, i think. Now let’s break this down.
Let’s now check one of my favourite project redis, And a data structure/algorithm that fascinates me. The hyperloglog, HLL

struct hllhdr {
char magic[4]; /* "HYLL" */
uint8_t encoding; /* HLL_DENSE or HLL_SPARSE. */
uint8_t notused[3]; /* Reserved for future use, must be zero. */
uint8_t card[8]; /* Cached cardinality, little endian. */
uint8_t registers[]; /* Data bytes. */

Now this looks simple, and some of you might think it’s deceptively simple.(Don’t it’s straight forward simple). The real magic is actually in how these values are computed. There’s a reason HLL is called an algorithm and not considered a data structure.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.