Channel: Ayende @ Rahien
Browsing all 166 articles

Looking into Corax’s posting lists: Part III

We looked into the internals of Corax’s posting list, and in the last post I mentioned that we have a problem with the Baseline of the page. We are by no means the first people to work with posting lists,...

View Article


Fight for every byte it takes: Storing raw numbers

I write databases for a living, which means that I’m thinking a lot about persistence. Here is a fun challenge that we went through recently. We have the need to store a list of keys and values and...

View Article


Fight for every byte it takes: Variable size data

In my previous post, we stored keys and values as raw numbers inside the 8KB page. That was simple, but wasteful. For many scenarios, we are never going to need to utilize the full 8 bytes range for a...

View Article

Fight for every byte it takes: Nibbling at the costs

In my last post we implemented variable-sized encoding to be able to pack even more data into the page. We were able to achieve 40% better density because of that. This is pretty awesome, but we would...

View Article

Fight for every byte it takes: Fitting 64 values in 4 bits

Moving to nibble encoding gave us a measurable improvement in the density of the entries in the page. The problem is that we have pretty much run out of room to do so. We are currently using a byte per...

View Article


Fight for every byte it takes: Optimizing the encoding process

In my previous post, I showed how we use the nibble offload approach to store the size of entries in space that would otherwise be unused. My goal in that post was clarity, so I tried to make sure that...

View Article

Fight for every byte it takes: Decoding the entries

In this series so far, we reduced the storage cost of key/value lookups by a lot. And in the last post we optimized the process of encoding the keys and values significantly. This is great, but the...

View Article

Integer compression: delta encoding + variable size integers

If you are building a database, the need to work with a list of numbers is commonplace. For example, when building an index, we may want to store all the ids of documents of orders from Europe. You can...

View Article
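
The teaser above is cut off before it reaches the encoding itself, so the following is only a minimal sketch of the two techniques named in the title, assuming a sorted list of non-negative ids and a generic 7-bits-per-byte variable-size integer format. The function names (`encode_varint`, `delta_encode`, `delta_decode`) are hypothetical and are not taken from the article.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte.

    The high bit of each byte is a continuation flag: 1 means more bytes follow.
    This is a generic format chosen for illustration (an assumption).
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def delta_encode(sorted_ids) -> bytes:
    """Store each id as the varint-encoded difference from the previous id."""
    out = bytearray()
    previous = 0
    for current in sorted_ids:
        out += encode_varint(current - previous)
        previous = current
    return bytes(out)


def delta_decode(data: bytes) -> list:
    """Reverse delta_encode: read varints and accumulate them into ids."""
    ids = []
    previous = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # continuation bit set, keep reading
        else:
            previous += value   # delta complete, rebuild the original id
            ids.append(previous)
            value = 0
            shift = 0
    return ids
```

Because the ids are sorted, the deltas are small and most of them fit in a single byte, which is where the saving comes from in this sketch.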


Integer compression: Understanding Simd Compression by Lemire

In the previous post, I showed how you can use integer compression using variable-size integers. That is a really straightforward approach for integer compression, but it isn’t ideal. To start with, it...

View Article


Integer compression: Using SIMD bit packing in practice

In the last post, I talked about how the simdcomp library is actually just doing bit-packing. Given a set of numbers, it will put the first N bits in a packed format. That is all. On its own, that...

View Article
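
To make the bit-packing idea in the teaser above concrete, here is a scalar sketch that keeps only the low N bits of each number and stores them back to back. It assumes "the first N bits" means the N least significant bits, it uses no SIMD at all, and `pack_bits` / `unpack_bits` are hypothetical names rather than the simdcomp API.

```python
def pack_bits(values, bits: int) -> bytes:
    """Pack the low `bits` bits of each value into a contiguous byte string."""
    mask = (1 << bits) - 1
    acc = 0
    for index, value in enumerate(values):
        # Anything above the low `bits` bits is silently dropped.
        acc |= (value & mask) << (index * bits)
    total_bits = len(values) * bits
    return acc.to_bytes((total_bits + 7) // 8, "little")


def unpack_bits(data: bytes, bits: int, count: int) -> list:
    """Read back `count` values of `bits` bits each from packed data."""
    mask = (1 << bits) - 1
    acc = int.from_bytes(data, "little")
    return [(acc >> (index * bits)) & mask for index in range(count)]
```

Note that packing is not compression on its own in this sketch: the caller has to pick a bit width large enough for every value, otherwise the high bits are lost.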

Integer compression: SIMD bit packing and unusual usages

I talked a bit before about the nature of bit packing and how the simdcomp library isn’t actually doing compression. Why do I care about that, then? Because the simdcomp library provides a very useful...

View Article

Integer compression: Understanding FastPFor

FastPFor is an integer compression algorithm that was initially published in 2012. You can read the paper about it here: Decoding billions of integers per second through vectorization. I’ve run into...

View Article

Integer compression: The FastPFor code

As I mentioned, I spent quite a lot of time trying to understand the mechanism behind how the FastPFor algorithm works. A large part of that was the fact that I didn’t initially distinguish the...

View Article


Integer compression: Porting simdcomp to C#

In the code of the simdcomp library there is a 25KLOC file that does evil things to SIMD registers to get bit packing to work. When I looked at the code the first few dozen times, I had a strong desire...

View Article

Integer compression: Adapting FastPFor to RavenDB

In this series so far, I explored several ways that we can implement integer compression. I focused on the FastPFor algorithm and dove deeply into how it works. In the last post, I showed how we can...

View Article


Integer compression: Implementing FastPFor encoding in C#

In the previous post I outlined the requirements we have for FastPFor in RavenDB. Now I want to dig into the actual implementation. Here is the shape of the class in question: The process starts when we...

View Article

Integer compression: Implementing FastPFor decoding in C#

In the previous post, I discussed FastPFor encoding; now I’m going to focus on how we deal with decoding. Here is the decode struct: Note that this is a struct for performance reasons. We expect that...

View Article


Integer compression: FastPFor in C#, results

After this saga, I wanted to close the series with some numbers about the impact of this algorithm. If you’ll recall, I started this whole series discussing variable-sized integers. I was using this...

View Article

Generating sequential numbers in a distributed manner

On its face, we have a simple requirement: Generate sequential numbers. Ensure that there can be no gaps. Do that in a distributed manner. Generating the next number in the sequence is literally as simple as...

View Article

Production postmortem: ENOMEM when trying to free memory

We got a support call from a client in the early hours of the morning. They were getting out-of-memory errors from their database and were understandably perturbed by that. They are running on a cloud...

View Article