Does C++ have a data structure similar to Python Pandas' DataFrame?

Is there an existing C++ library that offers call-by-name functionality similar to Python Pandas' DataFrame?

The particular functionality I want from this data structure is the following:
I can create a table of data with column names and row indices, and then access a particular cell's data like: table['ColName'][RowIndex].

Example:
table:
   Name   Sex  Age
0  Tom    M    20
1  Mary   F    23
2  Jerry  M    19

Then I can get:
table['Name'][1] = 'Mary'
table['Age'][2] = 19.

Thank you.
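
No such type ships with the language, but the table['ColName'][RowIndex] access described above can be approximated in standard C++17. The following is only a minimal sketch under stated assumptions: the names `Cell`, `Column`, `Table` and `add_column` are hypothetical, and the set of cell types is fixed to `int`, `char` and `std::string` to match the example table.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// A cell holds one of the types that appear in the example table.
using Cell = std::variant<int, char, std::string>;
using Column = std::vector<Cell>;

// Hypothetical minimal table: columns are looked up by name, rows by position.
class Table
{
public:
    // Add a column; throws if the name is already used or if the length
    // differs from the columns already stored.
    void add_column(const std::string& name, Column values)
    {
        if (columns_.count(name) != 0)
            throw std::invalid_argument("duplicate column: " + name);
        if (!columns_.empty() && values.size() != rows_)
            throw std::invalid_argument("column length mismatch: " + name);
        rows_ = values.size();
        columns_.emplace(name, std::move(values));
    }

    // table["Age"] returns the whole column, so table["Age"][2] is one cell.
    // Throws std::out_of_range for an unknown column name.
    const Column& operator[](const std::string& name) const
    {
        return columns_.at(name);
    }

    std::size_t rows() const { return rows_; }

private:
    std::unordered_map<std::string, Column> columns_;
    std::size_t rows_ = 0;
};
```

Because a cell is a `std::variant`, the caller has to name the expected type when reading it, e.g. `std::get<std::string>(table["Name"][1])` or `std::get<int>(table["Age"][2])`; asking for the wrong type throws `std::bad_variant_access`. That is the price of keeping the column name a run-time string.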
 
Not in standard C++, but it seems you are interested in heterogeneous data types (tuples!). Pandas is just a wacky name for tuples, a.k.a. structured or tabular data, yes? If so, then Boost C++ might come to the rescue.



As a sidekick, Boost C++ Bimap might also come in useful. See www.boost.org

And of course std::tuple in C++11
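
For the std::tuple route, a rough sketch of the example table stored as a vector of rows might look like this (C++14 or later). The names `Row`, `Field`, `make_table` and `cell` are made up for illustration. Note that the column is chosen at compile time here, not by a run-time string.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// One row of the example table: (Name, Sex, Age).
using Row = std::tuple<std::string, char, int>;

// Hypothetical column positions, so call sites read by name rather than by number.
enum Field : std::size_t { Name = 0, Sex = 1, Age = 2 };

// Build the three-row table from the question.
inline std::vector<Row> make_table()
{
    return {
        Row{"Tom", 'M', 20},
        Row{"Mary", 'F', 23},
        Row{"Jerry", 'M', 19},
    };
}

// Fetch one cell. The column is a compile-time constant, so the return
// type is exact and no run-time type check is needed.
// table.at(row) throws std::out_of_range for a bad row index.
template <std::size_t Col>
const std::tuple_element_t<Col, Row>& cell(const std::vector<Row>& table, std::size_t row)
{
    return std::get<Col>(table.at(row));
}
```

With this layout, `cell<Name>(table, 1)` yields "Mary" and `cell<Age>(table, 2)` yields 19 for the table built by `make_table()`.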


C++:
// TestFusion101.cpp
//
// Examples from Boost Fusion library.
//
// (C) Datasim Education BV 2016-2018
//

#include <vector>
#include <string>
#include <iostream>
#include <algorithm>
#include <typeinfo>                             // typeid
#include <boost/fusion/include/vector.hpp>      // fusion::vector
#include <boost/fusion/include/sequence.hpp>    // fusion::at_c
#include <boost/fusion/include/algorithm.hpp>   // fusion::for_each


namespace fusion = boost::fusion;

struct Print
{
    template <typename T>
    void operator()(T const& x) const
    {
        std::cout << "Name: " << typeid(x).name() << '\n';
    }
};


int main()
{
    fusion::vector<int, char, std::string> tuple(1, 'x', std::string("101"));
    int i = fusion::at_c<0>(tuple);
    char ch = fusion::at_c<1>(tuple);
    std::string s = fusion::at_c<2>(tuple);
  
    fusion::for_each(tuple, Print());

    return 0;
}
 
I'm pretty sure that it is all doable. Also, Boost Serialization might be useful as a simple DATA STORE.


A nice Adv C++ project if you have time to spare.

@APalley
 
Hi Professor, Thanks a lot for the idea!
Indeed, Bimap together with Boost MultiIndex looks like it could achieve the SQL-style searching syntax.
I will look into it!
AFAIR I cover Bimap in my Adv C++ course.

@APalley
C++:
// TestDNS.cpp
//
// Simple DNS lookup with IP addressing and domain names.
//
// (C) Datasim Education BV 2014
//

#include <boost/config.hpp>

#include <iostream>
#include <string>
#include <boost/bimap/bimap.hpp>
#include <boost/bimap/list_of.hpp>
#include <boost/bimap/unordered_set_of.hpp>


// UUID addresses
#include <boost/uuid/uuid.hpp>
#include <boost/uuid/uuid_generators.hpp>
#include <boost/uuid/uuid_io.hpp>

// Tags for better readability
struct IpAddress {};
struct DomainName {};

int main()
{
    using namespace boost::bimaps;

    typedef bimap
    <
        unordered_set_of< tagged< boost::uuids::uuid, DomainName > >,
        unordered_set_of< tagged< std::string, IpAddress > >,
        list_of_relation

    > DNS;

    DNS dnsDB;

    // We have to use `push_back` because the collection of relations is
    // a `list_of_relation`

    using namespace boost::uuids;

    // Creating uuids. (See Boost II book page 68)
    // From strings.
    string_generator strGen;
    uuid u1 = strGen("00000000-0000-0000-0000-000000000000");
    uuid u2 = strGen("0123456789abcdef0123456789ABCDEF");

    dnsDB.push_back( DNS::value_type(u1, "www.hello.com"));
    dnsDB.push_back( DNS::value_type(u2,"www.secret.com"));
 
    std::cout << "Size of DNS DB: " << dnsDB.size() << std::endl;
    // Look up u1 through the index tagged DomainName

    DNS::map_by<DomainName>::const_iterator name = dnsDB.by<DomainName>().find(u1);

    if( name !=dnsDB.by<DomainName>().end() )
    {
        std::cout << u1 << " has dns name: " << name->get<IpAddress>() << std::endl;
    }
 
    for( DNS::const_iterator i = dnsDB.begin(), i_end = dnsDB.end(); i != i_end ; ++i )
    {
        std::cout << i->get<DomainName>() << " <---> " << i->get<IpAddress>() << std::endl;
    }
    
 
    
    return 0;
}
 
Hello guys
No, C++ does not have a built-in data structure similar to Python Pandas' DataFrame. However, libraries such as Eigen, Armadillo, and Boost.MultiArray provide matrix and multi-dimensional array containers that cover part of that functionality.
 