Slatkin:
Welcome to my talk.
This is "Building Scalable,
Complex Apps on App Engine."
I am Brett Slatkin.
I am a software engineer
on the Google App Engine team.
And...yeah.
So let's just jump right in.
So the agenda--
here's the agenda.
So we're gonna talk
about two subjects today
that are, I think,
really interesting
for building
App Engine apps.
The first one
is list properties.
We're gonna talk
about what they are,
how they work.
I'm gonna give you
a concrete example
of microblogging
to illustrate
why they're useful.
And then we're gonna talk
about maximizing the performance
of list properties,
with some detail in there.
The second topic
is merge-join.
We're gonna talk
about what it is,
how it works,
and list property magic--
some really cool things
you can do with App Engine
that people don't realize
are actually possible.
And the concrete example
there that I'm going to show
is modeling
the social graph.
And again,
there's some other detail.
I didn't get hooked up
on the Moderator thing
in time for this, but I have
just set up my own here--
tinyurl.com/complextalk.
So please hop on,
put in some questions,
if you're confused
by anything,
vote up on anything,
hopefully that'll work.
All right, so let's--
let's go in here.
So I guess the one little thing
I wanted to say
before all this was...
it's been
a really exciting year
on App Engine,
on the App Engine Team--
seeing all the developers
do all these things,
seeing our developers
wrap their heads around
all the crazy things
that we ask you to do
with our datastore.
I am gonna make this
even harder for you
to understand,
maybe.
I'm gonna just throw you
into the fire a little bit
because people
seem to like that.
So if I'm going too fast
or it's a little complicated,
please ask me a question--
especially on Moderator--
but hopefully
it'll make sense.
But, you know,
if it doesn't make sense,
hopefully I'll explain it
in a later slide
and so, yeah,
just bear with me
is what I'm trying to say.
Okay, so what
is a list property?
So it's just another property
in the datastore.
It has multiple values.
So it's an ordered list
that maintains its order.
So when you write the entity
or you retrieve the entity,
it comes back
in the same order--
this property
comes back as a list
in the same order
that you wrote it.
Now, when you query,
it uses an equals filter.
But the interesting thing
is that when you do that query,
it matches any of the possible
values in that list.
So the entity will match
for any individual item
in that list.
And...and one of the things
that's also funny
is that--because of the way
that App Engine works,
the default sort order
on list properties
is not very useful.
It's kind of complicated,
and we'll get into why.
So you have to sometimes
use composite indexes
if you want
some distinct ordering
with list properties.
You'll understand
what I mean in a minute.
But they're
very easy to use.
I mean,
here's an example right here.
I have a "Favorites"
model class.
I have a list of colors,
and then I have the user
that those favorites belong to.
If I want to assign
a list property,
I just--it's a list;
I just assign it.
That's it.
That's all there is to it.
Now, why would you
want to do this, you know?
Like, this is
a one-to-many relationship
that I am modeling.
Why would you do this
as opposed to
inserting a new entry
in the database
for each--for each part
of the one-to-many relationship?
Well, there's a few reasons.
First, you can densely
pack information.
You can track lists
of related items in a way
that's much more efficient
in terms of space and CPU.
You could also use
multiple lists in parallel
to store "tuples."
So two lists in parallel
would be a pair.
Three would be a triple...
so on.
It's also just
kind of easier to use.
If you look at this example,
this would be a standard
one-to-many relationship
in the datastore.
Here again I have the--
a single entity
representing
a favorite color.
And I would write multiple
of these for each user.
And then to find the--a list
of somebody's favorite colors,
I would actually
have to do a query
and retrieve the data.
So, you know, that's
a lot more complicated
than just saying
.colors.
But, more importantly,
list properties are great
for answering
set membership questions.
And this is where
the queries come in.
So which colors--
like, here,
an example
is which color is like--
which users
like the color yellow?
And so here
is an example.
I say, "Give me
all favorite colors
where the favorite color
is yellow."
It gives me back
the results,
and then I can pull out
the usernames.
It's a simple query
to give me
some pretty
interesting information
very easily.
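The query semantics he's describing can be sketched in plain Python. This is just an illustration of the behavior, not the real App Engine API, and the model and names here are made up:

```python
# Plain-Python sketch (not the real App Engine API) of list-property
# semantics: an equality filter on a list property matches an entity
# when ANY element of the list equals the filter value.

class Favorites:
    def __init__(self, user, colors):
        self.user = user
        self.colors = colors  # ordered, multi-valued list property

def users_who_like(entities, color):
    """Set-membership query: which users like this color?"""
    return [e.user for e in entities if color in e.colors]

data = [
    Favorites("alice", ["yellow", "blue"]),
    Favorites("bob", ["red"]),
    Favorites("carol", ["yellow"]),
]
print(users_who_like(data, "yellow"))  # ['alice', 'carol']
```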
Now, this seems
really simple,
but this actually
has a great fan-out capability.
I can--I can cut
across all of my data.
I can match kind of any--
any value of yellow
in users' lists
of favorite colors
across all FavoriteColor
entities.
So we'll get more into
what "fan-out" means later.
But there's--there's some
really useful things in here.
Now, why would you
use list properties,
again, over the normal
one-to-many relationship?
Well, first of all,
each list item
only adds
one index entry.
So normally when you take a--
the App Engine datastore
and you write an entity,
you're writing
into multiple indexes.
You write the entity itself
into the entities table
and you write
into the "by kind" index.
And you write into each of
the different property indexes.
And you write
into composite indexes.
And so there's
a bunch of overhead
with actually
storing that entity.
And with a list item,
you only actually
add one index row.
That's it.
So--so where, you know,
a normal entity, right,
would have, say, five,
this would only have one.
So less storage.
You also save
the space of the keys
of that entity in that index,
which also saves you
more space.
And so, ultimately,
you're just saving space.
And it's easier to understand.
It's just a list.
There are a couple
of gotchas, though.
It doesn't come
for free, you know.
It's all about trade-offs.
So, you know,
list properties use more CPU
for serializing
and deserializing
than--than doing it
the other way.
When you access the entity,
you have to serialize
and deserialize the whole list.
We'll get into what that means
and why that's bad.
The other thing is,
like I said,
the sort orders aren't
very useful on their own.
So if you--
if you want to use
list properties
and sort orders,
you need to use
a composite index.
And you need
to be careful that
you only use
one list property
in a composite index
or else you get
this "exploding" problem.
What is exploding and
what is an exploding index?
So here's the idea:
You have a list property
with 1,000 entries in it.
And then you have
another list property
with 1,000 entries in it.
If you want to build
a composite index on both,
you need the--the cross product
of all of them.
So you actually need
1 million index entries
to represent
all the possible combinations
of how
those could pair up.
And that's--
that's the explosion
in storage space.
So watch out for that.
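The arithmetic behind that explosion is easy to check with a quick sketch, using the list sizes from the talk:

```python
# Why two list properties in one composite index "explode": the index
# needs one row for every combination of values from the two lists.
from itertools import product

list_a = ["a%d" % i for i in range(1000)]  # first list property
list_b = ["b%d" % i for i in range(1000)]  # second list property

# One composite index row per (value from list_a, value from list_b) pair.
index_rows = sum(1 for _ in product(list_a, list_b))
print(index_rows)  # 1000000 rows for a single entity
```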
But there's a lot of useful stuff
you can do with composite indexes
on just one list property.
And I'll explain.
So let's get into
our concrete example.
I feel like
every conference I've seen
in the last two years
has somebody explaining
how to make your favorite
microblogging service scale.
For real, this time,
on my platform,
I--I swear to you--
I guess I'm guilty of that.
But microblogging is actually
a very interesting example.
So microblogging--
I mean stuff like
Jaiku, Twitter, identi.ca,
some of the other ones
out there.
And it's basically
a very high speed feed
of small messages.
It represents the publish
and subscribe use case.
It's also broadcast,
multicast--
kind of the same thing.
The general idea is
the user sends a message
and it goes
to many other users...
a lot of users.
In some cases,
millions of them.
And it's a great example
of fan-out,
because
one simple user action--
typing in your 140 characters,
pushing return--
causes a lot of work
to happen,
causes a lot of data
that needs to be surfaced,
a lot of people
that need to be notified,
and it's so easy to do
that you're constantly trying
to catch up with this work.
That's why
it's hard to scale this.
And that's just
the nature of fan-out.
It's hard.
So let's talk about
an example of fan-out,
kind of the naive way
of doing it.
Now, fan-out is hard
because it can be inefficient.
You can require
duplicate data.
And so this is kind of
the email example.
If you imagine
that each of these users--
"B," "C," and "D"--
are different email users
on different email servers,
when SMTP
wants to send a message,
what does it do?
Well, it sends
a copy of the message
to each SMTP server.
So that's copying
all the bandwidth, all the data.
Each SMTP server
needs to manage it differently,
and all the users
get their own copy.
And eventually they download it
to their own machine,
and there's just a lot
of duplicate work happening.
Now, this is
kind of a testament
to why SMTP works so well.
It's scaled just fine.
But the throughput's
kind of lower in some ways.
And the fan-out's
usually pretty low.
People don't usually send
emails to a million people.
They send it to 100
or a listserv.
A better way
to do fan-out is like this.
You know,
not duplicate any data.
So ideally
we'd kind of do--
do message sending
by reference.
We'd have user "A"
send a message.
The message would be--
have all of the different...
users who should get it
annotated on it.
And then they would just
receive it by reference.
We would be able to figure out,
"Oh, here are the messages
that user 'B'
should receive."
So let's talk about
how to do that model
using a traditional
relational database.
This is probably
what it would look like.
There's probably
a better way to do it,
but this is the first thing
that I think would come to mind.
You have a table of users.
You have a table of messages.
Then you have the table of
messages that users should get.
So each user has an I.D.
and some properties--
their name, et cetera.
You have a table
of the messages,
the body of the message,
title,
time it was sent,
et cetera.
And then you have a--
the relationship
between these two
primary I.D.s,
which is the join table.
So how would you
do this in SQL?
Well, if you want to find
messages for the user "X,"
you would do a join.
You'd select from messages
all the user messages--
select from messages
all messages
with a "join"
on the UserMessages table
using message_id to join
the two tables together.
So that's basically
connecting the dots.
And then you would say,
"Now only give me
the ones for user 'X.'"
And that's how you
get that list.
But there are no joins
on App Engine.
So how do you do this?
How would you
implement this on App Engine?
The answer
is list properties.
They--they come
to our rescue here.
So with App Engine,
I define
a simple message class.
This is a--
just a DB model.
I'm doing this in Python just
'cause it's a little shorter.
I say,
"Here's my sender."
I have a list
of receivers.
These are the people
that should get my messages
by reference.
Then I have a body, which is
the actual body of the message.
Now, to figure out
what messages I get,
I just do a simple query.
I say, "Give me all the messages
where I'm a receiver."
Seems pretty natural.
There are no joins.
It's just--it's just done.
And this is essentially
how Jaiku works
at its core--
this simple idea
of doing fan-out
with list properties.
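The idea can be sketched in plain Python. This is a toy stand-in for the datastore, not Jaiku's or App Engine's actual code:

```python
# Plain-Python sketch of fan-out by reference: each message carries
# its receivers in a list property, so "my inbox" is one equality
# filter on that list -- no per-receiver copies of the message.

class Message:
    def __init__(self, sender, receivers, body):
        self.sender = sender
        self.receivers = receivers  # list property of usernames
        self.body = body

def inbox(messages, user):
    """Give me all the messages where I'm a receiver."""
    return [m.body for m in messages if user in m.receivers]

msgs = [
    Message("a", ["b", "c", "d"], "hello everyone"),
    Message("c", ["a"], "a private reply"),
]
print(inbox(msgs, "b"))  # ['hello everyone']
```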
Now, with JDO
you can do the same thing.
Here, again,
you have the autogenerated I.D.
of the message.
You have a sender,
you have the body,
and you have
a list of receivers.
Exactly the same idea.
To do the query--
I'm sorry.
Everyone get it?
All right.
To do the query
you do exactly the same thing.
It's a list property
with an equals filter.
You get
your PersistenceManager,
you create your query
on the Message.class,
you filter on the receivers
using an equals filter,
and then you
get back a message.
And at that point,
results.receivers
is the list of receivers
that you want to--to access.
Okay, so let's have a--
let's have a go.
Let me show you the actual--
let me show you it in action.
So this is
a little app I wrote
called
"Publish-subscribe test."
The source code's
available online.
I have a link to it later.
And this lets me demonstrate
how this works.
So to make this simple,
I just made it
so that you specify the number
of receivers you want.
And it'll go
and send that message
to that many receivers.
So here I can
have my nickname: Brett.
I want to send it to,
say, 100 receivers,
and say, "Hi from I/O."
Then I submit.
Great. Message sent.
So now if I look
at Message 1,
I can retrieve it.
And then I have
a list of them here.
I think--I think I have
too many examples in here.
I didn't do paging yet,
but it's in there.
And I'll get to why
I have all these--
all this demo data
in just a sec.
Right.
So pretty simple.
Really easy.
Really simple query.
Easy to do writes.
So let's talk about
the performance of this, though.
So index writes are done
in parallel on BigTable.
That means that it's fast.
It's really fast.
You can update a list property
with 1,000 items.
That'll actually do
1,000 different writes
to BigTable simultaneously
in parallel,
which is--
that's a lot of throughput.
The write time complexity--
the cost, CPU time, the latency--
it scales linearly
with the number of items.
And right now we limit--
we limit you to 5,000
indexed properties per entity
kind of for safety.
And the storage cost
is about the same
as a relational
database system.
In a relational database,
you'd have that
user messages table.
You'd have the user key
and you'd have the message key
in the join table,
and that would be your cost
of sending a message.
You'd have to write
different rows for each person
that needs
to receive the message.
In the datastore,
it's exactly the same way.
You have the entity key
and then you have
the list property value--
the person
who should receive it.
And those are the only things
that you need to store.
So it's just primary keys.
But there's a downside--
I kind of hinted at it--
Serialization.
So the problem is that
writes have
to pack your list
into one serialized
protocol buffer.
Protocol buffers
are an internal tool--
it's now external,
available for you.
It's a very quick
serialization
marshaling
kind of protocol.
And it's really fast,
and it's space efficient.
But even so, if you
have 1,000, 2,000 items,
that packaging
can take a little while.
Now, for writes
this is okay
because writes
are relatively infrequent.
They're a lot less
frequent than reads.
The killer here is that
queries also have to unpackage
all the result entities.
So if you remember
our message class--
I'm gonna zip back here.
Our message class
has a list of receivers.
If you have
1,000 receivers in there,
that's gonna take
a little while,
even if you retrieve just one message.
Now, what if you
want to fetch 100 messages?
Then you have to deserialize
100 times 1,000--
100,000 different
list property values--
from your entities.
And that serialization cost
gets pretty high.
So we've got a problem.
And it's--
it uses a lot of CPU.
It uses a lot of latency,
datastore latency,
so end user latency
is pretty high.
It's--it's--
it's just not very good.
And so now you're saying,
"Well, why are you telling me
about all this stuff
if I can't use it,"
right?
Well, luckily,
I have a solution.
But we'll get to it.
So really,
when I query for messages,
I only really want
the message information.
I don't really care
about the index
that got me the information
after the fact.
The problem with App Engine--
we'll get into this in a bit--
is that, you know, you're
carrying your indexes with you
because you're using
a denormalized schema
a lot of the time.
The receivers list
is a denormalized index.
And when you
retrieve an entity,
you pull
that whole index down also.
But we don't really care
about the list property
after we query.
We, you know, we only
care about the message body,
the sender,
and metainformation.
So what if we could selectively
skip certain properties
when we're querying?
So, for instance,
the receivers list--
just don't give it
back to me.
Then we could avoid
the serialization cost.
Then we'd just get the message
bodies that we care about.
And ideally we could do
something like what SQL does,
where we can just say,
"Hey," you know,
"give me only these
columns of the table."
But you can't do that.
We don't have support
for that yet.
Still trying to figure out
how that works
with our overall
scheme in the DB module.
But we have a solution.
And that's something called
the relation index entity.
This is kind of
a new thing.
I don't--I don't know
what the best name for it is,
but I'm calling it
a relation index entity.
So the idea
is really simple.
Let's split the message
into two entities, okay?
The first one
is the message model.
That contains
what we care about.
It has the body,
and the sender,
and the metainformation that
we actually want to access
once we've done the query.
Then we have another one,
which is the MessageIndex.
And the MessageIndex
just has
the fields
we want to query on.
Nothing else.
So here they are.
You can see,
I just split it in half.
That's all I did.
Now, the key thing
is that we put them
in the same entity group
so that we can transactionally
operate on both of them
at the same time.
And we make the MessageIndex
a child of the message.
What that means is that
the MessageIndex primary key
implicitly points to the--
to the message's--
to its parent's
primary key.
In the App Engine datastore,
all primary keys
include the full path
of--of that entity
and its parent elements.
So...so in this case,
if we had
just the MessageIndex,
I know who its parent is
just by looking at the key.
All I need is the key.
So a new thing
that we just launched--
I think it was in
our 1.2.2 launch
just a couple of weeks ago--
was something called
the key-only query.
And a key-only query
lets me do...
a key-only fetch
on the message index type.
What does that mean?
So I want to do a query
and only get back the keys
that match.
Then I want
to transform those keys
to de-reference the pointer
to their parents, essentially.
And then I want to fetch
the actual messages
that I care about in batch,
very quickly.
And so here's
an example of that.
I do a GQL query.
I say,
"Select the key
from the MessageIndex."
Just show me
the matching entities.
Don't pull down
the whole MessageIndex entity.
Don't--don't deserialize
the list properties.
Just avoid it.
Just give me the keys.
And then, you see,
on the next line I say
keys = k.parent().
That traverses the key
to give me the parent entity,
which is a message.
And then I say
db.get(keys)
and that gets all of those keys
in parallel.
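The whole pattern can be sketched in plain Python with simulated keys and entities. The key encoding here is invented purely for illustration; it is not the real datastore API:

```python
# Sketch of the relation index entity idea: the index entity's key
# embeds its parent's key, so a keys-only query on the index finds
# the parent message without ever deserializing the receivers list.

messages = {  # parent entities: key -> the payload we care about
    "Message:1": {"sender": "brett", "body": "Hi from I/O"},
    "Message:2": {"sender": "ryan", "body": "Hello"},
}

# Child index entities: the key includes the parent's path; they hold
# only the list property we filter on, never the message body.
indexes = {
    "Message:1/MessageIndex:1": {"receivers": ["alice", "bob"]},
    "Message:2/MessageIndex:2": {"receivers": ["carol"]},
}

def messages_for(receiver):
    # "keys-only query": collect matching index keys, bodies untouched
    keys = [k for k, v in indexes.items() if receiver in v["receivers"]]
    # dereference each key to its parent by stripping the child segment
    parent_keys = [k.rsplit("/", 1)[0] for k in keys]
    # batch-get the parent messages
    return [messages[k] for k in parent_keys]

print(messages_for("bob"))  # [{'sender': 'brett', 'body': 'Hi from I/O'}]
```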
Now, under the covers,
this is what App Engine's
datastore is doing anyway.
This is--this is--
this is how we work.
We--we scan one index,
we figure out
all the keys you need,
and then we grab them all from
the actual entities table
in parallel.
Ryan Barrett's talk
on "Under the Covers
of the App Engine Datastore"
gets into some--
from last year
gets into some really
interesting detail
about how that works.
But hopefully
you can see the efficiency
we're gonna get from this,
because we've
completely avoided
all of the list property
deserialization.
We can--
we can just figure out
what--what entities
we need to get,
go and retrieve them,
and then--
and then pull down
the part of the message
we actually care about.
So I--I have this written up
also in that same demo.
And what's cool
is I have a little check box
to do either--
either of the examples.
So I have a little timing box
I can click on.
Then I click "Retrieve".
So you'll see this
is retrieving, I think,
ten list property entities,
ten message entities.
Each one has
2,000 receivers in it.
And it takes 1.3 seconds.
That is a really long time.
That is way too long.
1.6--
so it's got some variance.
It's--it's way too slow.
That's just for ten.
But if I change this
to use the key-only ones--
again, I have ten entities
in there--
oh, need timing on.
You see it's a lot faster.
It's--it's ten times faster.
It's--it's sometimes faster
than ten times faster.
It's--it's way better.
And this is because I'm avoiding
all the serialization costs
associated with
serializing and deserializing
those list properties.
So what's the conclusion here?
Relation index entities
give us
much better performance.
We have the same cost
for doing writes
but reads are ten times
faster and cheaper.
We have--ten times
faster and cheaper
than the old way
of doing it,
where we had to do
deserialization.
And we have the best of both
worlds with list properties.
We have very low
storage costs
because we only write
one index row
for each list value
and the low CPU cost
because the queries
are really easy to do.
So this is a way
of doing fan-out.
It's a very efficient
way of doing fan-out.
Now, what if we want
to keep going further?
What if we need
more fan-out?
I said there was a limit
of 5,000 entit--
5,000 index rows.
Well, in that case,
if we need more indexes,
you can just write
multiple index--
relation index entities
per message.
There's nothing stopping you
from having multiple
child entities
that refer back
to their parent.
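That sharding idea can be sketched in plain Python. The 5,000 figure is the limit mentioned above, and the key format is invented for illustration:

```python
# Sketch: when a message needs more receivers than fit in one entity,
# split the receiver list across several child index entities
# ("shards"). Each shard's key still points back to the same parent.

MAX_ROWS = 5000  # per-entity indexed-property limit from the talk

def shard_receivers(message_key, receivers, max_rows=MAX_ROWS):
    """Yield (shard_key, receiver_chunk) pairs for one message."""
    for n, i in enumerate(range(0, len(receivers), max_rows)):
        yield ("%s/MessageIndex:%d" % (message_key, n),
               receivers[i:i + max_rows])

receivers = ["u%d" % i for i in range(12000)]
shards = list(shard_receivers("Message:1", receivers))
print(len(shards))  # 3 shards: 5000 + 5000 + 2000 receivers
```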
You can add these index entities
in the background
using our new
Task Queue API
which I'll
talk about tomorrow.
And I think this
provides a solution
to the million-fan-out
problem.
I think that
the million-fan-out problem
is kind of
a really hard problem
that people have been
talking and talking about
how to solve for a while.
And I think
this'll let you do it.
I think the other thing
that's cool about this
is that you don't have
to migrate your schema to do it.
If you say, "Hey,
I need an index on 'X'"
suddenly
for this user,
just go through and write
a new relation index
for each of your users
as a child element.
That's all you have to do.
You don't have to change
any of your schema.
They're just
totally decoupled--
your indexes and your data.
So the final picture
of what you'll have
is you'll have the message,
which is what you care about,
and then just
a series of indexes
written as child entities.
And you use those child entities
as references
to the parent
to find the parent,
but then don't--
you don't deserialize them ever.
And--and what
you're doing here really
is kind of accessing
a lower-level
BigTable-like interface
of doing row range
and index scans.
So if you have a problem
doing fan-out,
doing set membership
fan-out,
I encourage you
to look at list properties.
Okay.
This going okay so far?
People with me?
Yeah?
I see some nods.
That's good.
Okay, so merge-join.
Merge-join
is a funny word.
Merge-join is--
is a--is the secret sauce
that I don't think
people realize is there.
So what is it?
Well, people say,
"Oh, App Engine,
"I'm not gonna use
App Engine.
App Engine
doesn't support joins."
Well, that's not true,
actually.
We don't support
natural joins
or inner joins
or outer joins.
Those are useful joins.
We don't support those.
But we do support
merge-joins.
A merge-join is a self-join.
We can join tables
with its--
we can join
a table with itself.
And what that means
is you can combine
multiple equality filters--
multiple equality tests
in a single query.
And so what
that lets you do
is determine Venn-diagram-like
overlaps in sets.
So an example
is something very useful
for your business
application:
I want to see the overlap
of spots, horns, and legs.
Obviously that's a cow--
four legs.
Now, if I want to do a query
to figure out this data
in some taxonomy
of animals,
merge-join is what
you can use to do it.
I'll show you why.
But why would you
want to use merge-join?
Well, it's great
for exploring your data,
data mining-like
operations.
There's--the practical
limit of equality tests
is pretty high.
You can have ten or more
filters pretty easily.
And you don't have
to build indexes in advance
to do these queries.
So if you're doing
a Venn-diagram-like query,
you don't even
have to build an index.
You can just do it right now.
You can--you can even do it
from the shell application
right in your deployed
App Engine app.
And that's--that's great
for doing ad hoc queries
if you want
to figure out something.
It also reduces cost,
'cause indexes take space,
and indexes take time
to build.
And you can actually provide
some pretty
advanced functionality
with Venn-diagram-like
operations.
So an example
is what you have in Gmail.
Think of a Gmail filter
or a Gmail search
that you can do.
You have various labels
applied,
read/unread status,
a month, year, day,
number of replies,
recipients, et cetera.
These are all
properties of a message.
If you think about it,
these are all just
Venn-diagram-like overlaps
in sets.
All you're doing
is testing set membership.
But you're doing those--
those tests in parallel,
together,
in a single filter.
So what's an example
of merge-join?
Well, here's my very useful
animal class.
I have a list
of what it has--
in this case, horns.
The color is spotted.
And it has four legs.
And if I want to do a query
to find animals like this animal
I--I can just
do this query:
SELECT * FROM ANIMAL
where the color is spots,
has horns, legs 4.
Now, that doesn't
look like a join.
But it is
if you think about it.
And I'll show you
what I mean.
Well, I'll show you
what I mean in a second
and why it's relevant.
But let's talk about
how this actually works
in App Engine.
So...so--well, first
let's talk about how
it works
in a relational database.
Well, the relational database
usually--
the query planner
has histograms.
The query planner
knows that I have
this many records
with this value for spots,
this many records
with this value for horns,
this many records
that have four legs.
And then it determines
a query plan
to do the smallest--
to scan through
the smallest result set.
And then it starts just
either going through memory,
or through B-trees on disk,
to go and actually
do a linear scan
through all of your entities
and then find
the matching one.
And that's--that's how--
that's how SQL's doing it.
So...so how do we do that?
We don't have histograms.
Well, first of all,
it's not available in BigTable.
There are
similar optimizations
that work like this
in other DB systems.
But this is some
special sauce that we have,
we've had since launch.
So, first of all,
we store all property indexes
in sorted order,
ascending order.
If you know anything
about BigTable,
that should sound familiar
because that's
what BigTable does.
So it's--
everything is stored
in lexical order,
ascending order.
And then,
the datastore
essentially does
a merge-sort at runtime.
So we take
this sorted list
and we merge it with itself
in different locations.
And we use a zig-zag algorithm
to make this efficient,
to efficiently join tables.
And what this lets us do
is scan a single BigTable index
in parallel,
and it's so easy, and--
Yeah.
This is really hard.
This is not
a Google interview question,
but it sure sounds like one.
And I'm not going to be able
to explain this
without actually
showing you how it works.
So let me show you
how it works.
I can talk about
"zig-zag" and merge
and scans and all this stuff,
but you just need to see it
to understand it.
So here's--
let's say this is
a BigTable index,
okay?
These are--the tables
represent property indexes.
Now, this is
in ascending order.
You'll see that,
for each property,
I have its value:
color=red.
And then I have
the primary key
of the entity.
So...
color equals red ants,
spotted bear,
spotted cow,
white dog.
And these are different sections
of the same BigTable,
effectively,
which is the animal--
the animal property table.
So the first thing I do--
if you go back
to the GQL Query--
I'm gonna do--I'm doing
three separate equality filters.
I'm doing one on color,
has, and legs.
So first I--
so I start with a color.
Then I scan through BigTable
to the row
that matches spots.
Then I say, "Okay,
that sounds great."
I found spots. I found
the first part of my filter.
So this part--
this equality--
this equality test
has been matched.
So now it moves on
to the next one,
which is has.
And then it scans through
and goes, "Ah!
Now I found this key
with horns that also matches."
But what's important here,
if you look
for number one,
the key
I have matched is bear.
For number two,
the key I have matched is cow.
Now cow is lexically,
alphabetically after bear,
which means that
bear is wrong.
We need to actually find
the next thing after bear.
But we need to find
the next thing after bear
but before cow
or equal to cow.
So this is the "zig."
We move number one.
We say, "Okay, well,
these keys don't match."
Bear and cow don't match.
So we'll start moving
the window on number one
so we match cow.
We need those keys to match
because if we don't have
a matching key,
we don't have
a matching result.
So now we're satisfied again
with two filters.
We have color equals spots,
key is cow,
has is horns,
key is cow.
We've--we've satisfied
these two equality tests.
But now we move on
to the next one.
We start off
with legs=4.
The first thing--
result we get is a cat.
Again,
the key does not match.
Lexically, the key cat
is before cow.
So again,
we do a BigTable scan
to scan after this--
this row
but before a--a cow,
which are the other results
we have.
And that's the "zag."
And that moves--that moves
the--that iterator down.
And now we have a match.
We have--the key--
the keys all are the same
and all of the equality
tests are satisfied.
So we've joined
this table with itself.
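The zig-zag walk just traced can be written down as a short algorithm. This is a plain-Python sketch of the idea, not App Engine's actual implementation:

```python
# Sketch of the zig-zag merge join over sorted index scans. Each
# equality filter yields an ascending list of entity keys; the join
# repeatedly advances whichever scan is behind ("zig", "zag") until
# every scan sits on the same key, which is then a result.
import bisect

def zigzag_join(scans):
    """scans: ascending key lists, one per filter. Keys in all scans."""
    results = []
    positions = [0] * len(scans)
    while all(p < len(s) for p, s in zip(positions, scans)):
        # the largest current key is the only possible next match
        candidate = max(s[p] for p, s in zip(positions, scans))
        # advance every scan to its first key >= candidate
        for i, s in enumerate(scans):
            positions[i] = bisect.bisect_left(s, candidate, positions[i])
        if all(p < len(s) for p, s in zip(positions, scans)):
            keys = [s[p] for p, s in zip(positions, scans)]
            if all(k == candidate for k in keys):  # all scans agree
                results.append(candidate)
                positions = [p + 1 for p in positions]
    return results

# color=spots scans [bear, cow]; has=horns scans [cow];
# legs=4 scans [cat, cow] -- only "cow" satisfies all three.
print(zigzag_join([["bear", "cow"], ["cow"], ["cat", "cow"]]))  # ['cow']
```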
Okay.
Does that make sense?
Cool.
So I like zig-zag.
I think it's pretty cool.
Good name.
Good word.
So let's go with
the concrete example.
What would you actually
use this for?
So let's talk about
modeling the social graph.
So a social graph is
just a good example for this.
You can model
all kinds of graphs.
Graphs are very useful
for doing all kinds of things.
So anything you
can represent with nodes
and relationships between
nodes, vertices, whatever,
you can--
you can do this way.
So a social--
the social graph's useful
because the questions
are easy.
So, you know,
each user has a profile
and a set of friends.
We're gonna use merge-join
on list properties
to do some magic for us.
So merge-join lets us answer
all these great questions
about relationships.
First of all,
who are my friends?
Who are my friends
in San Francisco?
Which friends
do I have in common
with somebody else?
Which friends do I have
in common with somebody else
in San Francisco--
where somebody else
is a specific person?
And if you look at a lot
of social sites out there,
these are
the queries you see.
These are the--
you know, the--
on the main profile pages
and dashboards.
They say, "Hey," you know,
"you have these friends
in common," and so and so.
Now, for simplicity
I'm just gonna do
relationships two-way.
There's no reason you
couldn't use this for a DAG.
It's just easier this way.
So...
here's my group of friends.
These are the people:
Mel, Bob, Stu, Willie,
Lenny, Carl.
Maybe the names
are familiar.
They live
in three locations.
The lines are the friendships
between them.
Now, if we want
to answer a query like:
Who's a mutual friend
of Willie and Carl?
That's easy.
It's Lenny.
There he is.
You see that relationship
with the arrows.
Similarly,
if we want to figure out
who's a mutual friend of Bob
and Willie in San Francisco,
we know that's Stu.
Now, you see how this is
a Venn-diagram-like query.
That's what
we're figuring out.
But we're operating
on a graph.
Now, how would you do this
with a relational system?
You'd have a person table,
which has the user
and their location,
and then you'd have
a table of friends.
You'd normalize it somehow
to make sure you
don't have duplicates,
but you'd have User "A"
and User "B"
and their friends.
And then to--
so then simple queries like:
Who--who are
the friends of user "X"?
You'd just do
a simple join
between the users table
and the friends table.
You say, "Give me
all the users,
"join on friends,
where the user I.D.
matches the friend I.D."
And--and so here,
the first--the first part
I have, user_b_id,
that's the join property.
And then the constraint is
user_a equals "X" equals me.
So show me all my friends.
Show me all the friends
of user "X."
And if you want to filter
by location
you just add another filter
to the users table.
And that gets
more complicated
when you want to do
a graph query.
If you want to do
a graph query, now,
you want to figure out:
Give me all the friends
common to "X" and "Y."
So you
SELECT * FROM Users,
INNER JOIN on Friends 1
and Friends 2--
'cause you're joining
on the same table
so you need two representations
of that table.
You need
to figure out where
the user_id equals
f1 user I.D.
and the other user_id
equals f2 user I.D.
And then you
also need constraints
to make sure that you have
user "X" and user "Y"
and then an actual merge
that actually does the join,
which is that last line.
It's pretty complicated.
But if you're familiar
with SQL, that--that works.
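The relational version just described can be sketched with Python's built-in sqlite3 module. The schema, names, and sample graph here are illustrative stand-ins, not the actual slides:

```python
import sqlite3

# Illustrative schema: a users table and a friends table that stores
# each friendship in both directions so lookups are one-sided.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY, location TEXT);
    CREATE TABLE friends (user_a_id TEXT, user_b_id TEXT);
""")
people = [("mel", "LA"), ("bob", "SF"), ("stu", "SF"),
          ("willie", "SF"), ("lenny", "NY"), ("carl", "NY")]
conn.executemany("INSERT INTO users VALUES (?, ?)", people)
edges = [("willie", "lenny"), ("carl", "lenny"), ("willie", "stu"),
         ("bob", "stu"), ("bob", "mel")]
conn.executemany("INSERT INTO friends VALUES (?, ?)",
                 edges + [(b, a) for a, b in edges])

# Friends common to two users: self-join the friends table twice,
# merging on the shared friend id (the join is the ON clauses).
rows = conn.execute("""
    SELECT u.id FROM users u
    INNER JOIN friends f1 ON u.id = f1.user_b_id
    INNER JOIN friends f2 ON u.id = f2.user_b_id
    WHERE f1.user_a_id = ? AND f2.user_a_id = ?
""", ("willie", "carl")).fetchall()
print([r[0] for r in rows])  # the mutual friend of Willie and Carl
```

Adding `AND u.location = ?` to the WHERE clause gives the "in San Francisco" variant of the same query.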
But we don't have inner joins
on App Engine
like we're using here.
And--and we don't,
you know,
and people have said
we don't have joins.
Well, we do have
merge-join.
We can do self-joins.
The only really important
part of this is the last line,
which is joining
the two tables together.
So here's how we do
it in App Engine.
We have a person,
they have a location,
and they have
a list of friends.
And we do
a really simple query.
So it's
very natural sounding.
It's:
Give me all the people
where they have this friend
and they have that friend
in San Francisco.
That's it.
That answers--that answers
the question.
It'll de-dup.
It'll give you
unique result sets.
It'll do that
merge-join query
and give it to you.
And you can add
as many filters as you need
up to a practical limit.
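On App Engine that query is just chained equality filters, e.g. `Person.all().filter('friends =', 'bob').filter('friends =', 'willie').filter('location =', 'SF')` in the Python db API. Here's a plain-Python sketch of the semantics those filters have (the entities and names are illustrative, not the real demo data):

```python
# Each person has a scalar location and a list property of friends
# (on App Engine: db.StringProperty and db.StringListProperty).
people = {
    "mel":    {"location": "LA", "friends": ["bob"]},
    "bob":    {"location": "SF", "friends": ["mel", "stu"]},
    "stu":    {"location": "SF", "friends": ["bob", "willie"]},
    "willie": {"location": "SF", "friends": ["stu", "lenny"]},
    "lenny":  {"location": "NY", "friends": ["willie", "carl"]},
    "carl":   {"location": "NY", "friends": ["lenny"]},
}

def query(filters):
    """Chained equality filters, merge-join style: a list property
    matches when it contains the value, a scalar when it equals it."""
    results = []
    for name in sorted(people):
        props = people[name]
        if all(value in props[prop] if isinstance(props[prop], list)
               else props[prop] == value
               for prop, value in filters):
            results.append(name)
    return results

# Who is a mutual friend of Bob and Willie in San Francisco?
print(query([("friends", "bob"), ("friends", "willie"),
             ("location", "SF")]))  # ['stu']
```

The datastore answers this with index scans rather than a table walk, but the result set is the same Venn-diagram intersection.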
So let's demo this.
I'm gonna kind of go back
so you can actually
see these examples.
So...
I have two--
two things here.
So I'm--
let's say I'm Carl,
and I want to see...
and I'm looking at Willie.
And I want to say:
Who are the friends
who are common
to...Carl and Willie?
So you remember...
Carl and Willie, right?
Who are the friends
common to Carl and Willie?
And it's Lenny.
Oh!
So Carl, Willie--
Lenny is a common friend.
So there we go.
Pretty easy to--to parse.
Similarly, you can see
the friends Lenny and Stu
as being Willie's friends.
Now the next one
was answering:
Who is a mutual friend of Bob
and Willie in San Francisco?
And here I have this tab.
So...friends in common
to Bob and Willie
in San Francisco.
And this application
will let you
model these relationships
and do all
the various queries.
And it shows all the different
queries you can do
with a social graph.
And so you can use this
kind of as an example app
for how to do merge-join
where merge-join is useful.
So what's the performance?
Obviously this can't
come for free, right?
So merge-join scales with
the number of filters you have
and the size
of the result set.
The result set--
so that means
it's best for queries
with fewer results,
say, less than 100,
less than a few hundred.
If you have a lot
of overlapping data
or a lot of results that could
potentially match the merge-join
that could be problematic.
But usually that's okay
because a lot of these queries--
you're trying to find overlaps.
You're trying to add more
and more equality filters,
trying to narrow
that set of messages.
You're looking for
a needle in a haystack.
This is really useful
for answering those questions.
And the great thing is
it has similar performance
as list properties.
So it has
the same read/write speed.
There's no storage overhead.
There are no
extra indexes to write.
And exactly in the same way
with list properties
and relation indexes,
you can avoid all
the serialization costs
by using relation indexes.
So you can do all this
for exactly the same costs.
Now, there are some gotchas.
You need to watch out
for pathological datasets.
Too many overlapping values
causes a lot of zig-zagging,
like I said.
That's, effectively, where
we have to keep scanning through
the keys to find matches,
and we have to keep going
between our two different
windows in the merge-join
to find an overlapping set.
I think that if
your result set's small,
then the chances of this
are very low.
But you should go back through
this, you know, merge-join
and just think through,
you know,
what it's doing to understand
the potential for this.
Effectively, it's--
when you have to keep scanning
and scanning and scanning
inch by inch
to find matches,
back and forth.
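The zig-zagging just described can be sketched as a toy algorithm over sorted key lists, one list per equality filter. This is a simplification of what the datastore actually does, purely to show where the back-and-forth scanning comes from:

```python
from bisect import bisect_left

def zigzag_merge_join(scans):
    """Intersect several sorted, duplicate-free index scans zig-zag
    style: every scan seeks forward to the current candidate key, and
    any scan that overshoots raises the candidate for the others."""
    if not scans or any(not s for s in scans):
        return []
    results = []
    candidate = scans[0][0]
    while True:
        agreed = 0
        for keys in scans:
            i = bisect_left(keys, candidate)  # seek to first key >= candidate
            if i == len(keys):
                return results                # one scan ran out: done
            if keys[i] == candidate:
                agreed += 1
            else:
                candidate = keys[i]           # overshot: zig to a new candidate
                break
        if agreed == len(scans):
            results.append(candidate)
            i = bisect_left(scans[0], candidate) + 1
            if i == len(scans[0]):
                return results
            candidate = scans[0][i]

# Heavily overlapping scans force many candidate bumps -- the
# pathological "inch by inch" case from the talk.
print(zigzag_merge_join([[1, 3, 5, 7], [2, 3, 6, 7], [3, 4, 7, 9]]))
```

When the scans share few keys, the candidate jumps ahead in big steps; when they almost-but-not-quite overlap everywhere, each match costs many small seeks.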
Another thing
that's really important
is that this doesn't work
with composite indexes.
So the--the exploding
index combinations
will keep this from working.
And that's because when you
have multiple equals filters--
we're using merge-join
to do the merge.
Now, the second
you apply a sort order,
or you have any kind
of inequality filter
you need to add to this,
it breaks down,
because we need to--
we need to store,
in our index, the--
the cross product
of all of those list properties.
So...so that
gets really, really big
really fast, like I said,
and it won't work.
What that means is
you can't apply sort orders
to some things
that are used this way.
Now, that's--that's not--
that's not good.
But there's a couple
of things you can do.
So, you know,
you can sort in memory.
For a small result set,
that's okay,
especially if you use
relation indexes.
The--the actual size of
that data is very low
and it's cacheable.
The other thing is that,
if you remember
from the relation index code
that I showed you,
first we got all the keys
and then we transformed the keys
and then we retrieved the actual
entities we care about.
There's no reason you can't
do some filtering up front.
If there's something
that you know about your keys
that you can filter them
further,
you can do that filtering
way in advance
and, you know,
filter at the key level
before you even retrieve
the--the entities.
So that--that's another way
of optimizing this.
But...it's really--
it's really best
for membership tests,
Venn diagram tests
where the sorting is done
after the fact.
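That key-level filtering trick can be sketched in plain Python. The key format and `fetch_entities` helper are illustrative; on App Engine the index scan would be a keys-only query and the batch lookup would be `db.get(keys)`:

```python
# A pretend datastore: entity key -> entity.
datastore = {
    "Message:2009-05-27:1": {"body": "hello"},
    "Message:2009-05-27:2": {"body": "hi"},
    "Message:2009-05-26:9": {"body": "old news"},
}

def fetch_entities(keys):
    """Stands in for db.get(keys): one batch lookup, no per-key query."""
    return [datastore[k] for k in keys]

index_keys = sorted(datastore)  # stands in for a keys-only index scan

# Filter at the key level first -- here, keep only today's messages --
# so we never pay to fetch and deserialize entities we'd discard.
wanted = [k for k in index_keys if k.startswith("Message:2009-05-27")]
messages = fetch_entities(wanted)
print(len(messages))  # 2
```

The win is that the filtering happens on small, cacheable keys before the single batch get, not on full entities afterwards.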
Okay.
So we're gonna do
a little bit of wrap-up
and then we're gonna
get to some questions.
So I encourage you
to use list properties,
use merge-join
for many things.
I believe that this can solve
fan-out for your application.
It'll let you scale
fan-out problems very well,
especially when combined
with our Task Queue API.
This'll let you,
you know,
have a large set of entities
that match a certain index
without a problem.
You can use this same idea
to match geospatial info.
So instead of a list
of my friends,
I could have a list
of coordinates,
or a list
of geographical regions,
or bounding boxes
on a globe,
and do queries on that.
Again, these are
set membership problems.
Now if I add
merge-join to the party,
I can do relationship graphs.
I can do a bunch
of graph operations
that are very interesting.
I could also do
"fuzzy" values.
So if you have any--
any, you know, fuzzy values
that you need to store,
you can do that
with list properties.
For instance,
I can store today's date.
I can store the month,
today's date,
or I can store the year,
the month, the day,
today's date, you know,
the hour, the minute,
the second--
I can store a whole list
of all these different
resolutions of data
and then query
on any one of them
to get a different subsection
of the data.
So a lot of people,
when they do a query on time,
they say,
"Oh, give me everything
between these two times."
And that's not the right way
to think about this.
You need to think about
converting your queries
into set membership tests.
So compute those memberships
at write time
and you'll enjoy
very fast reads.
If you want to find
everything from today,
just mark it
as being from today.
If you want to find
everything from this hour,
when you write the--
when you write
an entity to the data store,
mark it
as being from today
at this hour.
And then all you have to do
to retrieve the data
is a single equality filter.
That's it.
There's no--
There's no inequality filters
required.
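Here's a sketch of that precompute-at-write-time idea for dates. On App Engine the `stamps` list would live in a `db.StringListProperty`; the format strings are just one illustrative choice:

```python
import datetime

def time_stamps(dt):
    """Precompute every resolution of a timestamp at write time, so
    reads become single equality filters instead of range scans."""
    return [
        dt.strftime("%Y"),              # everything from this year
        dt.strftime("%Y-%m"),           # ... this month
        dt.strftime("%Y-%m-%d"),        # ... this day
        dt.strftime("%Y-%m-%d %H"),     # ... this hour
        dt.strftime("%Y-%m-%d %H:%M"),  # ... this minute
    ]

entities = [
    {"body": "a", "stamps": time_stamps(datetime.datetime(2009, 5, 27, 10, 30))},
    {"body": "b", "stamps": time_stamps(datetime.datetime(2009, 5, 27, 11, 5))},
    {"body": "c", "stamps": time_stamps(datetime.datetime(2009, 5, 26, 9, 0))},
]

# "Everything from today" is one equality filter, no inequalities:
today = [e["body"] for e in entities if "2009-05-27" in e["stamps"]]
print(today)
```

The datastore version is `Entity.all().filter('stamps =', '2009-05-27')`: a pure set-membership test, paid for once at write time.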
If you want to see the demos,
I have put them up available
with source code.
This is hopefully useful.
It takes a little while
to get them up otherwise--
pubsub-test.appspot.com
and
dagpeople.appspot.com.
It's Python code.
And you can do
all these same things in JDO.
And we're working on
further optimizations for JDO,
but it's all
functionally equivalent.
There's more information
on our site:
code.google.com/appengine.
So now let's go
to some questions.
Let me load this up.
[applause]
Thanks.
So yeah, there are mics
here and here
you can come up to.
Yeah, cool.
So I'll just start off
with some of these
and then, if you
have any other questions,
please step up
to the mic.
And if you want to duck out,
thanks for coming.
So...
"What about ReferenceProperty
and ReferenceListProperty?
"Are they as efficient
as StringListProperty?
Can we apply the same practices
you described in your talk?"
Yes, you can.
You can.
The only reason you don't
want to do this, necessarily,
is because ReferenceProperties
include keys.
And keys can be
bulky sometimes.
It has the kind,
the app I.D.,
and the primary I.D. or name
of the entity
in that key.
So in terms
of actual storage space,
you know, a key might be,
you know, 50 bytes
and a single string
might be 10.
So it depends
on how hardcore
you're trying to do
your optimizations.
But, yeah, you can
use ReferenceProperties
if you want
to do it that way.
So...
"How well does
list property scale?
"How many entries
can we put in it
and not suffer
in performance?"
Hopefully
I answered that one.
For writes,
you're gonna pay a cost
to do that serialization,
to put it in the datastore.
But to retrieve it back,
you can avoid
that serialization cost
and overhead.
Yeah.
man:
I wanted to ask...
in merge-join,
about the WHERE clause.
You explained how
the zig-zag...
algorithm works.
So it seems that it matters,
the order of the clauses
inside the WHERE clause
for the zig-zag to work.
And hence it affects
the performance.
Is that true?
Slatkin: So sorry, could you
repeat that one more time?
man: Does the order of the
statements in the WHERE clause...
Slatkin:
Oh.
man: Affect the performance
of the query?
Slatkin: Right.
That's a good question.
Yeah, so the question,
I guess,
is--does the--yeah.
Does the order
of the equality filters
in the merge-join
affect the performance?
I believe, yes, it does.
So if you--if you know
that you have a certain property
that has a high correlation
to the result set,
you should list that first.
I think that we don't--
we don't have histograms
and we don't have
a query planner to use them,
so any hints
you can give us
on the way to find
your result set is--is useful.
And I think that
setting the equality--
the order of the equality
filters will do that.
I need to confirm that,
but that's
a really good question.
man:
It's not a big difference.
Slatkin:
It's not a big difference?
So, yeah, Ryan implemented it
along with Max,
and he says it doesn't
make a big difference.
I guess BigTable scanning
is very fast
so that could be
a good explanation.
Any other questions?
Yep.
man: So if you have, say,
a list of objects
that each have a price,
and the price is dynamic,
but you want to be able
to search on...
objects that have a price
above a certain value.
Is it possible
to do range queries
without having to build a bunch
of fancy indexes behind them,
because the data
might be changing?
Slatkin:
Yeah, so...yeah.
So that's--yeah,
so, you know,
if you have data
that's changing,
and you don't
want to have to have
indexes for range queries,
how do you do it?
So this is--this is
kind of the--
the answer
to these kinds of questions
usually is
precompute your views
at write time.
So if I have, you know,
an example would be
an auction site.
Show me everything between
the price of 0 and 5,
5 and 20,
20 and 100,
100 and 1,000.
You know that those are
going to be your views.
When you update
the price of an item,
add it--add that slice of data
to your list property.
And then you can query
just for set membership.
You don't have to do
a range query.
And if you need
to change your ranges later,
or add more ranges,
you just can write
some relation index entities
to do that,
or you can update the ones
you already have.
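A sketch of that price-bucket idea, using the example boundaries from the answer. On App Engine, `price_buckets` would feed a list property that gets recomputed whenever the price changes:

```python
# Fixed, precomputed views: whenever an item's price changes, recompute
# which slices it belongs to and store them in its list property.
BUCKETS = [(0, 5), (5, 20), (20, 100), (100, 1000)]

def price_buckets(price):
    """Return the bucket labels this price falls into."""
    return ["%d-%d" % (lo, hi) for lo, hi in BUCKETS if lo <= price < hi]

items = {"lamp": 3, "chair": 45, "sofa": 450}
index = {name: price_buckets(price) for name, price in items.items()}

# "Show me everything between 20 and 100" is now a set-membership
# test -- a single equality filter -- not a range query.
matches = sorted(name for name, buckets in index.items()
                 if "20-100" in buckets)
print(matches)  # ['chair']
```

Adding or changing ranges later just means rewriting the affected list properties, which is exactly the kind of background work the Task Queue API is suited for.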
And our Task Queue API that
we'll be talking about tomorrow
will greatly increase the--
your ability to do that.
But yeah, I mean,
when you have code
to update that index,
just, you know,
change the way you're
slicing your--your fuzziness.
That's what a fuzzy query is.
That's a really
good example of one.
We got some negative questions.
That's good.
[chuckles]
"Doesn't
the relation index entity
suffer from
the n+1 select problem?"
Who asked this question, and
could you please come to a mic?
I don't know
the n+1 select problem.
Anybody?
You, okay.
Can you tell us--what is
the n+1 select problem?
man: When you select--
when you select the...
the actual number
of elements first,
and then it does the select
for each of the element.
If you go back
to the slide...
Slatkin: Yeah.
man: There was
a "for" loop order.
Slatkin:
Oh.
So...
you're talking about...
relation index...
so this one.
Is this what
you're talking about?
man: Yes, this one.
There's a "for" loop, right?
Slatkin: This loop. That's
what you're talking about?
man: Yep.
Slatkin: So that loop
is just going over
the index result set.
Those entities
have already been...
So this is a list of keys,
these indexes here.
They're not separate queries.
They are not--they don't
require SELECTs.
man: And when you do
the db.get of keys
you actually get
everything at once.
Slatkin: Exactly.
So...so you're doing one SELECT,
one set of BigTable scans--
actually, one BigTable scan
to get all the keys.
And then you're taking
all those keys,
transforming them, and doing
one, like, hash table lookup
to pull
all those values back.
So you don't have--
there's no n+1 problem.
It's just--it's one query,
one get.
Cool.
I have one more...
I think this one's
a little out of scope,
so, yeah, I can answer this
for you later.
This one: "You talked about
avoiding list deserialization
"for--you talked about avoiding
list deserialization for reads.
"What about for writes?
"Can we add a list value
without deserializing
and serializing
the entire list?"
So there's a trade-off here.
You could just write
another message index entity.
That has a lot
of costs associated with it.
So you're paying--
the trade-off--the trade-off is
storage space
or CPU time.
You could write a new entity
with that index value,
a new relation index...
when you need
a new index value.
But then that's
a whole other entity
and it has
all of its associated indexes
and costs of the keys
and so on.
So you need to figure out
that trade-off for yourself.
If you don't really care
about storage space so much,
then that's
a really good solution.
But storage
is usually pretty cheap.
So I could see the trade-off.
But writes are also
very infrequent,
so it could make sense
to update things.
It would be awesome
if we had a way
of just appending
to a list property
without having to retrieve it.
That would be
the ideal solution.
And then we wouldn't
have to do any of this.
You'd just say,
"Oh, yeah,
"append this value
to a string list
"and don't even
come back to me.
I just want you to do it."
And that--
that would be--
that would be one way
to solve this also.
Yeah.
man: Yeah. What makes
a relation index entity
better than
just a join entity,
where you would have
the reference to the message
and the reference
to the user?
Slatkin:
Right, so...
on the join entity--
so yeah.
So--so the question is,
yeah.
What makes a relation index
better than a join entity?
A relation index, effectively,
is a join entity.
That's what it's doing.
It's serving
the same purpose.
The reason that it's useful is
that it's a child of the parent.
So if you saw...oh.
Right here.
The message index
is a child of its parent.
So we can scan for
message indexes very quickly,
get their keys, and then figure
out what their parents are.
We can de-reference
that pointer really quickly
in memory with a minimal amount
of serialization.
If you do this the other way,
with just, like,
one-to-many relationships,
or with queries,
or...
yeah, with a standard, like,
one-to-many relationship style.
Because we don't have joins,
you'd have to first
do a query
for ten items,
and then do a query
for each of those results
to actually
find the references.
And that's--that's
the problem.
So...so, yeah,
MessageIndex
is essentially a join table.
It's a speed-up,
like that.
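That parent-pointer de-referencing is just key manipulation, which is why it's so cheap. A plain-Python sketch, with an illustrative key layout; on App Engine you would simply call `key.parent()` on the keys from a keys-only query:

```python
# A relation index entity is stored as a child of the message it
# indexes, so its key embeds the parent's key as a prefix.
index_keys = [
    ("Message", 1, "MessageIndex", 101),
    ("Message", 5, "MessageIndex", 105),
]

def parent_key(key):
    """Drop the child's own (kind, id) pair -- the equivalent of
    db.Key.parent() on App Engine."""
    return key[:-2]

# One keys-only scan gave us index_keys; one in-memory transform
# yields the message keys; one batch db.get() would fetch them all.
message_keys = [parent_key(k) for k in index_keys]
print(message_keys)
```

No second round of queries is needed, which is the whole point of making the index entity a child rather than a peer.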
Yeah.
Dredge: Hi, Ert Dredge
from Universal Metaphor.
Do you have suggestions
when we're doing
our initial data architecture
for helping to implement
these suggestions
that might not be obvious
from the talk itself?
Slatkin: Yeah,
that's a great question.
So are there optimizations
that you should make
when implementing your data
or modeling your data
not captured here?
Well,
there are a few things.
There is a few kind of hidden--
hidden parts of this
that--that we need
to be more clear about.
For instance, how do you compute
the cost of indexes?
And how do you--
like, storage costs.
And how do you compute
the cost of...
also the datastore
storage itself?
We're working on making this
more clear to people
so they can actually kind of
calculate their index--
their--the index size
that their data would have.
And how to figure out
the overhead
of having composite indexes
and list properties.
So that's something
that would be best for you--
when you're modeling
your data, you say,
"Oh, well, if I need
this composite query,
that could explode,"
or that, you know,
"that's gonna require
an extra row,"
and then that has
a large cost.
Another thing
to remember is that
in a standard SQL database,
primary key is usually
just an integer
which has a very small
amount of storage space.
With us,
we also have I.D.s,
numerical I.D.s
that are primary keys,
but our keys, like I said,
encode the full path
to the element.
That includes the app I.D.,
the kind,
any child and parent
relationships are in there,
finally down
to the primary key.
And so, it's useful--
it sounds funny,
but it's useful to actually
use kind names that are short
and to use key names
that are short,
because it reduces
the amount of index space
you have to write,
which is kind of
a funny thing.
Another one
that's kind of similar
is there's a lot of
really interesting things
you can do around encoding.
So one of the ones I know--
I forget the name of it.
If someone remembers it,
shout it out.
There's this type
of encoding
where if you know something's
going to be--
happen really frequently,
you should just--you should
make its representation
be as short as possible.
There we go.
Huffman encoding, yeah.
So if you can do that--
if you know that,
you know,
like, you're gonna
have a celebrity
and they're gonna be using
your microblogging system
all the time,
or you have one node
that's connected to everything,
you want its primary key
to be very short.
Because its--its value
is going to be
in a lot of different lists.
It's going to be
in a lot of relationships.
It's going to have
a lot of index rows.
And you want
to minimize its size.
And so if you can use
that kind of encoding technique
you can really minimize
your storage space a lot.
Cool.
Over here.
man: Is there any way
to do like queries
or full text search?
Slatkin: To do like queries
or full text search.
So, yeah,
so the best answer we have--
there--okay, three answers.
So the first answer
is that we have
a "poor man's"
full text search,
which is the Google App Engine
ext search module
that Ryan Barrett wrote.
That just uses merge-join,
like I've described here,
to do full text queries.
It doesn't have
ranking and ordering,
and it has a bunch
of other problems,
but it's really good
for simple stuff.
And, like I said, it's good
for a small result set.
So if you know
that you're only gonna get
20...20 results,
100 results,
and then can sort them
in memory, it's great.
In terms
of full text search,
then you need something like
a real full text search.
This is, you know...
you need ranking
and you need all kinds
of duplicate elimination
and all kinds of stuff.
This is something that
we know our users want,
we're asked about
all the time.
I don't have
anything to announce
but we know you want it.
You know,
"This is Google.
"They don't have
full text search.
What's going on?"
Yeah, we know.
[laughter]
It's hard, actually.
So...
we're--we're--that's
the best answer I can give you.
I think there's a number three,
but I forgot what it was.
See if there was...
where did that--ah.
Is there anything else...
[indistinct chatter]
Oh, this one's
putting it down.
"One of the reasons for you
giving us list properties
"is to save storage space.
"Since Google is handling
storage, why do we care?"
Billing is enabled.
You can store...yeah.
You can--you can store
gigabytes of data.
It's not just one gig
of free quota.
Quotas--you can get
that quota really high now.
And it's pretty cheap.
So you can store a lot of data
without even knowing it.
So...yeah, you know.
We try to instill
the best practices
in our programming model,
in our APIs.
But we can't always do that,
and so this is kind of
a cautionary tale
to tell you to--
to think about storage
before you--
before you actually
rack up a big bill
or something.
I think
I'm out of questions, and...
so thanks a lot
for coming.
Again, I really appreciate it.
[applause]