Slatkin:
Welcome to my talk.
This is "Building Scalable,
Complex Apps on App Engine."
I am Brett Slatkin.
I am a software engineer
on the Google App Engine team.
And...yeah.
So let's just jump right in.
So the agenda--
here's the agenda.
So we're gonna talk
about two subjects today
that are, I think,
really interesting
for building
App Engine apps.
The first one
is list properties.
We're gonna talk
about what they are,
how they work.
I'm gonna give you
a concrete example
of microblogging
to illustrate
why they're useful.
And then we're gonna talk
about maximizing the performance
of list properties,
with some detail in there.
The second topic
is merge-join.
We're gonna talk
about what it is,
how it works,
and list property magic--
some really cool things
you can do with App Engine
that people don't realize
are actually possible.
And the concrete example
there that I'm going to show
is modeling
the social graph.
And again,
there's some other detail.
I didn't get hooked up
on the Moderator thing
in time for this, but I have
just set up my own here--
tinyurl.com/complextalk.
So please hop on,
put in some questions,
if you're confused
by anything,
vote up on anything,
hopefully that'll work.
All right, so let's--
let's go in here.
So I guess the one little thing
I wanted to say
before all this was...
it's been
a really exciting year
on App Engine,
on the App Engine Team--
seeing all the developers
do all these things,
seeing our developers
wrap their heads around
all the crazy things
that we ask you to do
with our datastore.
I am gonna make this
even harder for you
to understand,
maybe.
I'm gonna just throw you
into the fire a little bit
because people
seem to like that.
So if I'm going too fast
or it's a little complicated,
please ask me a question--
especially on Moderator--
but hopefully
it'll make sense.
But, you know,
if it doesn't make sense,
hopefully I'll explain it
in a later slide
and so, yeah,
just bear with me
is what I'm trying to say.
Okay, so what
is a list property?
So it's just another property
in the datastore.
It has multiple values.
So it's an ordered list
that maintains its order.
So when you write the entity
or you retrieve the entity,
it comes back
in the same order--
this property
comes back as a list
in the same order
that you wrote it.
Now, when you query,
it uses an equals filter.
But the interesting thing
is that when you do that query,
it matches any of the possible
values in that list.
So the entity will match
for any individual item
in that list.
And...and one of the things
that's also funny
is that--because of the way
that App Engine works,
the default sort order
on list properties
is not very useful.
It's kind of complicated,
and we'll get into why.
So you have to sometimes
use composite indexes
if you want
some distinct ordering
with list properties.
You'll understand
what I mean in a minute.
But they're
very easy to use.
I mean,
here's an example right here.
I have a "Favorites"
model class.
I have a list of colors,
and then I have the user
that those favorites belong to.
If I want to assign
a list property,
I just--it's a list;
I just assign it.
That's it.
That's all there is to it.
Now, why would you
want to do this, you know?
Like, this is
a one-to-many relationship
that I am modeling.
Why would you do this
as opposed to
inserting a new entry
in the database
for each--for each part
of the one-to-many relationship?
Well, there's a few reasons.
First, you can densely
pack information.
You can track lists
of related items in a way
that's much more efficient
in terms of space and CPU.
You could also use
multiple lists in parallel
to store "tuples."
So two lists in parallel
would be a pair.
Three would be a triple...
so on.
It's also just
kind of easier to use.
If you look at this example,
this would be a standard
one-to-many relationship
in the datastore.
Here again I have the--
a single entity
representing
a favorite color.
And I would write multiple
of these for each user.
And then to find the--a list
of somebody's favorite colors,
I would actually
have to do a query
and retrieve the data.
So, you know, that's
a lot more complicated
than just saying
.colors.
But, more importantly,
list properties are great
for answering
set membership questions.
And this is where
the queries come in.
So which colors--
like, here,
an example
is which color is like--
which users
like the color yellow?
And so here
is an example.
I say, "Give me
all favorite colors
where the favorite color
is yellow."
It gives me back
the results,
and then I can pull out
the usernames.
It's a simple query
to give me
some pretty
interesting information
very easily.
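The query semantics he's describing can be sketched in plain Python. This is just an illustration of the behavior, not the real App Engine API, and the model and names here are made up:

```python
# Plain-Python sketch (not the real App Engine API) of list-property
# semantics: an equality filter on a list property matches an entity
# when ANY element of the list equals the filter value.

class Favorites:
    def __init__(self, user, colors):
        self.user = user
        self.colors = colors  # ordered, multi-valued list property

def users_who_like(entities, color):
    """Set-membership query: which users like this color?"""
    return [e.user for e in entities if color in e.colors]

data = [
    Favorites("alice", ["yellow", "blue"]),
    Favorites("bob", ["red"]),
    Favorites("carol", ["yellow"]),
]
print(users_who_like(data, "yellow"))  # ['alice', 'carol']
```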
Now, this seems
really simple,
but this actually
has a great fan-out capability.
I can--I can cut
across all of my data.
I can match kind of any--
any value of yellow
in users' lists
of favorite colors
across all FavoriteColor
entities.
So we'll get more into
what "fan-out" means later.
But there's--there's some
really useful things in here.
Now, why would you
use list properties,
again, over the normal
one-to-many relationship?
Well, first of all,
each list item
only adds
one index entry.
So normally when you take a--
the App Engine datastore
and you write an entity,
you're writing
into multiple indexes.
You write the entity itself
into the entities table
and you write
into the "by kind" index.
And you write into each of
the different property indexes.
And you write
into composite indexes.
And so there's
a bunch of overhead
with actually
storing that entity.
And with a list item,
you only actually
add one index row.
That's it.
So--so where, you know,
a normal entity, right,
would have, say, five,
this would only have one.
So less storage.
You also save
the space of the keys
of that entity in that index,
which also saves you
more space.
And so, ultimately,
you're just saving space.
And it's easier to understand.
It's just a list.
There are a couple
of gotchas, though.
It doesn't come
for free, you know.
It's all about trade-offs.
So, you know,
list properties use more CPU
for serializing
and deserializing
than--than doing it
the other way.
When you access the entity,
you have to serialize
and deserialize the whole list.
We'll get into what that means
and why that's bad.
The other thing is,
like I said,
the sort orders aren't
very useful on their own.
So if you--
if you want to use
list properties
and sort orders,
you need to use
a composite index.
And you need
to be careful that
you only use
one list property
in a composite index
or else you get
this "exploding" problem.
What is exploding and
what is an exploding index?
So here's the idea:
You have a list property
with 1,000 entries in it.
And then you have
another list property
with 1,000 entries in it.
If you want to build
a composite index on both,
you need the--the cross product
of all of them.
So you actually need
1 million index entries
to represent
all the possible combinations
of how
those could pair up.
And that's--
that's the explosion
in storage space.
So watch out for that.
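The arithmetic behind that explosion is easy to check with a quick sketch, using the list sizes from the talk:

```python
# Why two list properties in one composite index "explode": the index
# needs one row for every combination of values from the two lists.
from itertools import product

list_a = ["a%d" % i for i in range(1000)]  # first list property
list_b = ["b%d" % i for i in range(1000)]  # second list property

# One composite index row per (value from list_a, value from list_b) pair.
index_rows = sum(1 for _ in product(list_a, list_b))
print(index_rows)  # 1000000 rows for a single entity
```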
But there's a lot of useful stuff
you can do with composite indexes
on just one list property.
And I'll explain.
So let's get into
our concrete example.
I feel like
every conference I've seen
in the last two years
has somebody explaining
how to make your favorite
microblogging service scale.
For real, this time,
on my platform,
I--I swear to you--
I guess I'm guilty of that.
But microblogging is actually
a very interesting example.
So microblogging--
I mean stuff like
Jaiku, Twitter, identi.ca,
some of the other ones
out there.
And it's basically
a very high speed feed
of small messages.
It represents the publish
and subscribe use case.
It's also broadcast,
multicast--
kind of the same thing.
The general idea is
the user sends a message
and it goes
to many other users...
a lot of users.
In some cases,
millions of them.
And it's a great example
of fan-out,
because
one simple user action--
typing in your 140 characters,
pushing return--
causes a lot of work
to happen,
causes a lot of data
that needs to be surfaced,
a lot of people
that need to be notified,
and it's so easy to do
that you're constantly trying
to catch up with this work.
That's why
it's hard to scale this.
And that's just
the nature of fan-out.
It's hard.
So let's talk about
an example of fan-out,
kind of the naive way
of doing it.
Now, fan-out is hard
because it can be inefficient.
You can require
duplicate data.
And so this is kind of
the email example.
If you imagine
that each of these users--
"B," "C," and "D"--
are different email users
on different email servers,
when SMTP
wants to send a message,
what does it do?
Well, it sends
a copy of the message
to each SMTP server.
So that's copying
all the bandwidth, all the data.
Each SMTP server
needs to manage it differently,
and all the users
get their own copy.
And eventually they download it
to their own machine,
and there's just a lot
of duplicate work happening.
Now, this is
kind of a testament
to why SMTP works so well.
It's scaled just fine.
But the throughput's
kind of lower in some ways.
And the fan-out's
usually pretty low.
People don't usually send
emails to a million people.
They send it to 100
or a listserv.
A better way
to do fan-out is like this.
You know,
not duplicate any data.
So ideally
we'd kind of do--
do message sending
by reference.
We'd have user "A"
send a message.
The message would be--
have all of the different...
users who should get it
annotated on it.
And then they would just
receive it by reference.
We would be able to figure out,
"Oh, here are the messages
that user 'B'
should receive."
So let's talk about
how to do that model
using a traditional
relational database.
This is probably
what it would look like.
There's probably
a better way to do it,
but this is the first thing
that I think would come to mind.
You have a table of users.
You have a table of messages.
Then you have the table of
messages that users should get.
So each user has an I.D.
and some properties--
their name, et cetera.
You have a table
of the messages,
the body of the message,
title,
time it was sent,
et cetera.
And then you have a--
the relationship
between these two
primary I.D.s,
which is the join table.
So how would you
do this in SQL?
Well, if you want to find
messages for the user "X,"
you would do a join.
You'd select from messages
all the user messages--
select from messages
all messages
with a "join"
on the UserMessages table
using message_id to join
the two tables together.
So that's basically
connecting the dots.
And then you would say,
"Now only give me
the ones for user 'X.'"
And that's how you
get that list.
But there are no joins
on App Engine.
So how do you do this?
How would you
implement this on App Engine?
The answer
is list properties.
They--they come
to our rescue here.
So with App Engine,
I define
a simple message class.
This is a--
just a DB model.
I'm doing this in Python just
'cause it's a little shorter.
I say,
"Here's my sender."
I have a list
of receivers.
These are the people
that should get my messages
by reference.
Then I have a body, which is
the actual body of the message.
Now, to figure out
what messages I get,
I just do a simple query.
I say, "Give me all the messages
where I'm a receiver."
Seems pretty natural.
There are no joins.
It's just--it's just done.
And this is essentially
how Jaiku works
at its core--
this simple idea
of doing fan-out
with list properties.
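The idea can be sketched in plain Python. This is a toy stand-in for the datastore, not Jaiku's or App Engine's actual code:

```python
# Plain-Python sketch of fan-out by reference: each message carries
# its receivers in a list property, so "my inbox" is one equality
# filter on that list -- no per-receiver copies of the message.

class Message:
    def __init__(self, sender, receivers, body):
        self.sender = sender
        self.receivers = receivers  # list property of usernames
        self.body = body

def inbox(messages, user):
    """Give me all the messages where I'm a receiver."""
    return [m.body for m in messages if user in m.receivers]

msgs = [
    Message("a", ["b", "c", "d"], "hello everyone"),
    Message("c", ["a"], "a private reply"),
]
print(inbox(msgs, "b"))  # ['hello everyone']
```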
Now, with JDO
you can do the same thing.
Here, again,
you have the autogenerated I.D.
of the message.
You have a sender,
you have the body,
and you have
a list of receivers.
Exactly the same idea.
To do the query--
I'm sorry.
Everyone get it?
All right.
To do the query
you do exactly the same thing.
It's a list property
with an equals filter.
You get
your PersistenceManager,
you create your query
on the Message.class,
you filter on the receivers
using an equals filter,
and then you
get back a message.
And at that point,
results.receivers
is the list of receivers
that you want to--to access.
Okay, so let's have a--
let's have a go.
Let me show you the actual--
let me show you it in action.
So this is
a little app I wrote
called
"Publish-subscribe test."
The source code's
available online.
I have a link to it later.
And this lets me demonstrate
how this works.
So to make this simple,
I just made it
so that you specify the number
of receivers you want.
And it'll go
and send that message
to that many receivers.
So here I can
have my nickname: Brett.
I want to send it to,
say, 100 receivers,
and say, "Hi from I/O."
Then I submit.
Great. Message sent.
So now if I look
at Message 1,
I can retrieve it.
And then I have
a list of them here.
I think--I think I have
too many examples in here.
I didn't do paging yet,
but it's in there.
And I'll get to why
I have all these--
all this demo data
in just a sec.
Right.
So pretty simple.
Really easy.
Really simple query.
Easy to do writes.
So let's talk about
the performance of this, though.
So index writes are done
in parallel on BigTable.
That means that it's fast.
It's really fast.
You can update a list property
with 1,000 items.
That'll actually do
1,000 different writes
to BigTable simultaneously
in parallel,
which is--
that's a lot of throughput.
The write time complexity--
the cost, CPU time, the latency--
it scales linearly
with the number of items.
And right now we limit--
we limit you to 5,000
indexed properties per entity
kind of for safety.
And the storage cost
is about the same
as a relational
database system.
In a relational database,
you'd have that
user messages table.
You'd have the user key
and you'd have the message key
in the join table,
and that would be your cost
of sending a message.
You'd have to write
different rows for each person
that needs
to receive the message.
In the datastore,
it's exactly the same way.
You have the entity key
and then you have
the list property value--
the person
who should receive it.
And those are the only things
that you need to store.
So it's just primary keys.
But there's a downside--
I kind of hinted at it--
Serialization.
So the problem is that
writes have
to pack your list
into one serialized
protocol buffer.
Protocol buffers
are an internal tool--
it's now external,
available for you.
It's a very quick
serialization
marshaling
kind of protocol.
And it's really fast,
and it's space efficient.
But even so, if you
have 1,000, 2,000 items,
that packaging
can take a little while.
Now, for writes
this is okay
because writes
are relatively infrequent.
They're a lot less
frequent than reads.
The killer here is that
queries also have to unpackage
all the result entities.
So if you remember
our message class--
I'm gonna zip back here.
Our message class
has a list of receivers.
If you have
1,000 receivers in there,
that's gonna take
a little while,
even if you retrieve just one message.
Now, what if you
want to fetch 100 messages?
Then you have to deserialize
100 times 1,000--
100,000 different
list property values--
from your entities.
And that serialization cost
gets pretty high.
So we've got a problem.
And it's--
it uses a lot of CPU.
It uses a lot of latency,
datastore latency,
so end user latency
is pretty high.
It's--it's--
it's just not very good.
And so now you're saying,
"Well, why are you telling me
about all this stuff
if I can't use it,"
right?
Well, luckily,
I have a solution.
But we'll get to it.
So really,
when I query for messages,
I only really want
the message information.
I don't really care
about the index
that got me the information
after the fact.
The problem with App Engine--
we'll get into this in a bit--
is that, you know, you're
carrying your indexes with you
because you're using
a denormalized schema
a lot of the time.
The receivers list
is a denormalized index.
And when you
retrieve an entity,
you pull
that whole index down also.
But we don't really care
about the list property
after we query.
We, you know, we only
care about the message body,
the sender,
and metainformation.
So what if we could selectively
skip certain properties
when we're querying?
So, for instance,
the receivers list--
just don't give it
back to me.
Then we could avoid
the serialization cost.
Then we'd just get the message
bodies that we care about.
And ideally we could do
something like what SQL does,
where we can just say,
"Hey," you know,
"give me only these
columns of the table."
But you can't do that.
We don't have support
for that yet.
Still trying to figure out
how that works
with our overall
scheme in the DB module.
But we have a solution.
And that's something called
the relation index entity.
This is kind of
a new thing.
I don't--I don't know
what the best name for it is,
but I'm calling it
a relation index entity.
So the idea
is really simple.
Let's split the message
into two entities, okay?
The first one
is the message model.
That contains
what we care about.
It has the body,
and the sender,
and the metainformation that
we actually want to access
once we've done the query.
Then we have another one,
which is the MessageIndex.
And the MessageIndex
just has
the fields
we want to query on.
Nothing else.
So here they are.
You can see,
I just split it in half.
That's all I did.
Now, the key thing
is that we put them
in the same entity group
so that we can transactionally
operate on both of them
at the same time.
And we make the MessageIndex
a child of the message.
What that means is that
the MessageIndex primary key
implicitly points to the--
to the message's--
to its parent's
primary key.
In the App Engine datastore,
all primary keys
include the full path
of--of that entity
and its parent elements.
So...so in this case,
if we had
just the MessageIndex,
I know who its parent is
just by looking at the key.
All I need is the key.
So a new thing
that we just launched--
I think it was in
our 1.2.2 launch
just a couple of weeks ago--
was something called
the key-only query.
And a key-only query
lets me do...
a key-only fetch
on the message index type.
What does that mean?
So I want to do a query
and only get back the keys
that match.
Then I want
to transform those keys
to de-reference the pointer
to their parents, essentially.
And then I want to fetch
the actual messages
that I care about in batch,
very quickly.
And so here's
an example of that.
I do a GQL query.
I say,
"Select the key
from the MessageIndex."
Just show me
the matching entities.
Don't pull down
the whole MessageIndex entity.
Don't--don't deserialize
the list properties.
Just avoid it.
Just give me the keys.
And then, you see,
on the next line I say
keys = k.parent().
That traverses the key
to give me the parent entity,
which is a message.
And then I say
db.get(keys)
and that gets all of those keys
in parallel.
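The whole pattern can be sketched in plain Python with simulated keys and entities. The key encoding here is invented purely for illustration; it is not the real datastore API:

```python
# Sketch of the relation index entity idea: the index entity's key
# embeds its parent's key, so a keys-only query on the index finds
# the parent message without ever deserializing the receivers list.

messages = {  # parent entities: key -> the payload we care about
    "Message:1": {"sender": "brett", "body": "Hi from I/O"},
    "Message:2": {"sender": "ryan", "body": "Hello"},
}

# Child index entities: the key includes the parent's path; they hold
# only the list property we filter on, never the message body.
indexes = {
    "Message:1/MessageIndex:1": {"receivers": ["alice", "bob"]},
    "Message:2/MessageIndex:2": {"receivers": ["carol"]},
}

def messages_for(receiver):
    # "keys-only query": collect matching index keys, bodies untouched
    keys = [k for k, v in indexes.items() if receiver in v["receivers"]]
    # dereference each key to its parent by stripping the child segment
    parent_keys = [k.rsplit("/", 1)[0] for k in keys]
    # batch-get the parent messages
    return [messages[k] for k in parent_keys]

print(messages_for("bob"))  # [{'sender': 'brett', 'body': 'Hi from I/O'}]
```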
Now, under the covers,
this is what App Engine's
datastore is doing anyway.
This is--this is--
this is how we work.
We--we scan one index,
we figure out
all the keys you need,
and then we grab them all from
the actual entities table
in parallel.
Ryan Barrett's talk
on "Under the Covers
of the App Engine Datastore"
gets into some--
from last year
gets into some really
interesting detail
about how that works.
But hopefully
you can see the efficiency
we're gonna get from this,
because we've
completely avoided
all of the list property
deserialization.
We can--
we can just figure out
what--what entities
we need to get,
go and retrieve them,
and then--
and then pull down
the part of the message
we actually care about.
So I--I have this written up
also in that same demo.
And what's cool
is I have a little check box
to do either--
either of the examples.
So I have a little timing box
I can click on.
Then I click "Retrieve".
So you'll see this
is retrieving, I think,
ten list property entities,
ten message entities.
Each one has
2,000 receivers in it.
And it takes 1.3 seconds.
That is a really long time.
That is way too long.
1.6--
so it's got some variance.
It's--it's way too slow.
That's just for ten.
But if I change this
to use the key-only ones--
again, I have ten entities
in there--
oh, need timing on.
You see it's a lot faster.
It's--it's ten times faster.
It's--it's sometimes faster
than ten times faster.
It's--it's way better.
And this is because I'm avoiding
all the serialization costs
associated with
serializing and deserializing
those list properties.
So what's the conclusion here?
Relation index entities
give us
much better performance.
We have the same cost
for doing writes
but reads are ten times
faster and cheaper.
We have--ten times
faster and cheaper
than the old way
of doing it,
where we had to do
deserialization.
And we have the best of both
worlds with list properties.
We have very low
storage costs
because we only write
one index row
for each list value
and the low CPU cost
because the queries
are really easy to do.
So this is a way
of doing fan-out.
It's a very efficient
way of doing fan-out.
Now, what if we want
to keep going further?
What if we need
more fan-out?
I said there was a limit
of 5,000 entit--
5,000 index rows.
Well, in that case,
if we need more indexes,
you can just write
multiple index--
relation index entities
per message.
There's nothing stopping you
from having multiple
child entities
that refer back
to their parent.
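That sharding idea can be sketched in plain Python. The 5,000 figure is the limit mentioned above, and the key format is invented for illustration:

```python
# Sketch: when a message needs more receivers than fit in one entity,
# split the receiver list across several child index entities
# ("shards"). Each shard's key still points back to the same parent.

MAX_ROWS = 5000  # per-entity indexed-property limit from the talk

def shard_receivers(message_key, receivers, max_rows=MAX_ROWS):
    """Yield (shard_key, receiver_chunk) pairs for one message."""
    for n, i in enumerate(range(0, len(receivers), max_rows)):
        yield ("%s/MessageIndex:%d" % (message_key, n),
               receivers[i:i + max_rows])

receivers = ["u%d" % i for i in range(12000)]
shards = list(shard_receivers("Message:1", receivers))
print(len(shards))  # 3 shards: 5000 + 5000 + 2000 receivers
```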
You can add these index entities
in the background
using our new
Task Queue API
which I'll
talk about tomorrow.
And I think this
provides a solution
to the million-fan-out
problem.
I think that
the million-fan-out problem
is kind of
a really hard problem
that people have been
talking and talking about
how to solve for a while.
And I think
this'll let you do it.
I think the other thing
that's cool about this
is that you don't have
to migrate your schema to do it.
If you say, "Hey,
I need an index on 'X'"
suddenly
for this user,
just go through and write
a new relation index
for each of your users
as a child element.
That's all you have to do.
You don't have to change
any of your schema.
They're just
totally decoupled--
your indexes and your data.
So the final picture
of what you'll have
is you'll have the message,
which is what you care about,
and then just
a series of indexes
written as child entities.
And you use those child entities
as references
to the parent
to find the parent,
but then don't--
you don't deserialize them ever.
And--and what
you're doing here really
is kind of accessing
a lower-level
BigTable-like interface
of doing row range
and index scans.
So if you have a problem
doing fan-out,
doing set membership
fan-out,
I encourage you
to look at list properties.
Okay.
This going okay so far?
People with me?
Yeah?
I see some nods.
That's good.
Okay, so merge-join.
Merge-join
is a funny word.
Merge-join is--
is a--is the secret sauce
that I don't think
people realize is there.
So what is it?
Well, people say,
"Oh, App Engine,
"I'm not gonna use
App Engine.
App Engine
doesn't support joins."
Well, that's not true,
actually.
We don't support
natural joins
or inner joins
or outer joins.
Those are useful joins.
We don't support those.
But we do support
merge-joins.
A merge-join is a self-join.
We can join tables
with its--
we can join
a table with itself.
And what that means
is you can combine
multiple equality filters--
multiple equality tests
in a single query.
And so what
that lets you do
is determine Venn-diagram-like
overlaps in sets.
So an example
is something very useful
for your business
application:
I want to see the overlap
of spots, horns, and legs.
Obviously that's a cow--
four legs.
Now, if I want to do a query
to figure out this data
in some taxonomy
of animals,
merge-join is what
you can use to do it.
I'll show you why.
But why would you
want to use merge-join?
Well, it's great
for exploring your data,
data mining-like
operations.
There's--the practical
limit of equality tests
is pretty high.
You can have ten or more
filters pretty easily.
And you don't have
to build indexes in advance
to do these queries.
So if you're doing
a Venn-diagram-like query,
you don't even
have to build an index.
You can just do it right now.
You can--you can even do it
from the shell application
right in your deployed
App Engine app.
And that's--that's great
for doing ad hoc queries
if you want
to figure out something.
It also reduces cost,
'cause indexes take space,
and indexes take time
to build.
And you can actually provide
some pretty
advanced functionality
with Venn-diagram-like
operations.
So an example
is what you have in Gmail.
Think of a Gmail filter
or a Gmail search
that you can do.
You have various labels
applied,
read/unread status,
a month, year, day,
number of replies,
recipients, et cetera.
These are all
properties of a message.
If you think about it,
these are all just
Venn-diagram-like overlaps
in sets.
All you're doing
is testing set membership.
But you're doing those--
those tests in parallel,
together,
in a single filter.
So what's an example
of merge-join?
Well, here's my very useful
animal class.
I have a list
of what it has--
in this case, horns.
The color is spotted.
And it has four legs.
And if I want to do a query
to find animals like this animal
I--I can just
do this query:
SELECT * FROM ANIMAL
where the color is spots,
has horns, legs 4.
Now, that doesn't
look like a join.
But it is
if you think about it.
And I'll show you
what I mean.
Well, I'll show you
what I mean in a second
and why it's relevant.
But let's talk about
how this actually works
in App Engine.
So...so--well, first
let's talk about how
it works
in a relational database.
Well, the relational database
usually--
the query planner
has histograms.
The query planner
knows that I have
this many records
with this value for spots,
this many records
with this value for horns,
this many records
that have four legs.
And then it determines
a query plan
to do the smallest--
to scan through
the smallest result set.
And then it starts just
either going through memory,
or through B-trees on disk,
to go and actually
do a linear scan
through all of your entities
and then find
the matching one.
And that's--that's how--
that's how SQL's doing it.
So...so how do we do that?
We don't have histograms.
Well, first of all,
it's not available in BigTable.
There are
similar optimizations
that work like this
in other DB systems.
But this is some
special sauce that we have,
we've had since launch.
So, first of all,
we store all property indexes
in sorted order,
ascending order.
If you know anything
about BigTable,
that should sound familiar
because that's
what BigTable does.
So it's--
everything is stored
in lexical order,
ascending order.
And then,
the datastore
essentially does
a merge-sort at runtime.
So we take
this sorted list
and we merge it with itself
in different locations.
And we use a zig-zag algorithm
to make this efficient,
to efficiently join tables.
And what this lets us do
is scan a single BigTable index
in parallel,
and it's so easy, and--
Yeah.
This is really hard.
This is not
a Google interview question,
but it sure sounds like one.
And I'm not going to be able
to explain this
without actually
showing you how it works.
So let me show you
how it works.
I can talk about
"zig-zag" and merge
and scans and all this stuff,
but you just need to see it
to understand it.
So here's--
let's say this is
a BigTable index,
okay?
These are--the tables
represent property indexes.
Now, this is
in ascending order.
You'll see that,
for each property,
I have its value:
color=red.
And then I have
the primary key
of the entity.
So...
color equals red ants,
spotted bear,
spotted cow,
white dog.
And these are different sections
of the same BigTable,
effectively,
which is the animal--
the animal property table.
So the first thing I do--
if you go back
to the GQL Query--
I'm gonna do--I'm doing
three separate equality filters.
I'm doing one on color,
has, and legs.
So first I--
so I start with a color.
Then I scan through BigTable
to the row
that matches spots.
Then I say, "Okay,
that sounds great."
I found spots. I found
the first part of my filter.
So this part--
this equality--
this equality test
has been matched.
So now it moves on
to the next one,
which is has.
And then it scans through
and goes, "Ah!
Now I found this key
with horns that also matches."
But what's important here,
if you look
for number one,
the key
I have matched is bear.
For number two,
the key I have matched is cow.
Now cow is lexically,
alphabetically after bear,
which means that
bear is wrong.
We need to actually find
the next thing after bear.
But we need to find
the next thing after bear
but before cow
or equal to cow.
So this is the "zig."
We move number one.
We say, "Okay, well,
these keys don't match."
Bear and cow don't match.
So we'll start moving
the window on number one
so we match cow.
We need those keys to match
because if we don't have
a matching key,
we don't have
a matching result.
So now we're satisfied again
with two filters.
We have color equals spots,
key is cow,
has is horns,
key is cow.
We've--we've satisfied
these two equality tests.
But now we move on
to the next one.
We start off
with legs=4.
The first thing--
result we get is a cat.
Again,
the key does not match.
Lexically, the key cat
is before cow.
So again,
we do a BigTable scan
to scan after this--
this row
but before a--a cow,
which are the other results
we have.
And that's the "zag."
And that moves--that moves
the--that iterator down.
And now we have a match.
We have--the key--
the keys all are the same
and all of the equality
tests are satisfied.
So we've joined
this table with itself.
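The zig-zag walk just traced can be written down as a short algorithm. This is a plain-Python sketch of the idea, not App Engine's actual implementation:

```python
# Sketch of the zig-zag merge join over sorted index scans. Each
# equality filter yields an ascending list of entity keys; the join
# repeatedly advances whichever scan is behind ("zig", "zag") until
# every scan sits on the same key, which is then a result.
import bisect

def zigzag_join(scans):
    """scans: ascending key lists, one per filter. Keys in all scans."""
    results = []
    positions = [0] * len(scans)
    while all(p < len(s) for p, s in zip(positions, scans)):
        # the largest current key is the only possible next match
        candidate = max(s[p] for p, s in zip(positions, scans))
        # advance every scan to its first key >= candidate
        for i, s in enumerate(scans):
            positions[i] = bisect.bisect_left(s, candidate, positions[i])
        if all(p < len(s) for p, s in zip(positions, scans)):
            keys = [s[p] for p, s in zip(positions, scans)]
            if all(k == candidate for k in keys):  # all scans agree
                results.append(candidate)
                positions = [p + 1 for p in positions]
    return results

# color=spots scans [bear, cow]; has=horns scans [cow];
# legs=4 scans [cat, cow] -- only "cow" satisfies all three.
print(zigzag_join([["bear", "cow"], ["cow"], ["cat", "cow"]]))  # ['cow']
```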
Okay.
Does that make sense?
Cool.
So I like zig-zag.
I think it's pretty cool.
Good name.
Good word.
So let's go with
the concrete example.
What would you actually
use this for?
So let's talk about
modeling the social graph.
So a social graph is
just a good example for this.
You can model
all kinds of graphs.
Graphs are very useful
for doing all kinds of things.
So anything you
can represent with nodes
and relationships between
nodes, vertices, whatever,
you can--
you can do this way.
So a social--
the social graph's useful
because the questions
are easy.
So, you know,
each user has a profile
and a set of friends.
We're gonna use merge-join
on list properties
to do some magic for us.
So merge-join lets us answer
all these great questions
about relationships.
First of all,
who are my friends?
Who are my friends
in San Francisco?
Which friends
do I have in common
with somebody else?
Which friends do I have
in common with somebody else
in San Francisco--
where somebody else
is a specific person?
And if you look at a lot
of social sites out there,
these are
the queries you see.
These are the--
you know, the--
on the main profile pages
and dashboards.
They say, "Hey," you know,
"you have these friends
in common," and so and so.
Now, for simplicity
I'm just gonna do
relationships two-way.
There's no reason you
couldn't use this for a DAG.
It's just easier this way.
So...
here's my group of friends.
These are the people:
Mel, Bob, Stu, Willie,
Lenny, Carl.
Maybe the names
are familiar.
They live
in three locations.
The lines are the friendships
between them.
Now, if we want
to answer a query like:
Who's a mutual friend
of Willie and Carl?
That's easy.
It's Lenny.
There he is.
You see that relationship
with the arrows.
Similarly,
if we want to figure out
who's a mutual friend of Bob
and Willie in San Francisco,
we know that's Stu.
Now, you see how this is
a Venn-diagram-like query.
That's what
we're figuring out.
But we're operating
on a graph.
Now, how would you do this
with a relational system?
You'd have a person table,
which has the user
and their location,
and then you'd have
a table of friends.
You'd normalize it somehow
to make sure you
don't have duplicates,
but you'd have User "A"
and User "B"
and their friends.
And then to--
so then simple queries like:
Who--who are
the friends of user "X"?
You'd just do
a simple join
between the users table
and the friends table.
You say, "Give me
all the users,
"join on friends,
where the user I.D.
matches the friend I.D."
And--and so here,
the first--the first part
I have, user_b_id,
that's the join property.
And then the constraint is
user_a equals "X" equals me.
So show me all my friends.
Show me all the friends
of user "X."
And if you want to filter
by location
you just add another filter
to the users table.
And that gets
more complicated
when you want to do
a graph query.
If you want to do
a graph query, now,
you want to figure out:
Give me all the friends
common to "X" and "Y."
So you
SELECT * FROM Users,
INNER JOIN on Friends 1
and Friends 2--
'cause you're joining
on the same table
so you need two representations
of that table.
You need
to figure out where
the user_id equals
f1 user I.D.
and the other user_id
equals f2 user I.D.
And then you
also need constraints
to make sure that you have
user "X" and user "Y"
and then an actual merge
that actually does the join,
which is that last line.
It's pretty complicated.
But if you're familiar
with SQL, that--that works.
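The relational version just described can be sketched with Python's built-in sqlite3 module. The schema, names, and sample graph here are illustrative stand-ins, not the actual slides:

```python
import sqlite3

# Illustrative schema: a users table and a friends table that stores
# each friendship in both directions so lookups are one-sided.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id TEXT PRIMARY KEY, location TEXT);
    CREATE TABLE friends (user_a_id TEXT, user_b_id TEXT);
""")
people = [("mel", "LA"), ("bob", "SF"), ("stu", "SF"),
          ("willie", "SF"), ("lenny", "NY"), ("carl", "NY")]
conn.executemany("INSERT INTO users VALUES (?, ?)", people)
edges = [("willie", "lenny"), ("carl", "lenny"), ("willie", "stu"),
         ("bob", "stu"), ("bob", "mel")]
conn.executemany("INSERT INTO friends VALUES (?, ?)",
                 edges + [(b, a) for a, b in edges])

# Friends common to two users: self-join the friends table twice,
# merging on the shared friend id (the join is the ON clauses).
rows = conn.execute("""
    SELECT u.id FROM users u
    INNER JOIN friends f1 ON u.id = f1.user_b_id
    INNER JOIN friends f2 ON u.id = f2.user_b_id
    WHERE f1.user_a_id = ? AND f2.user_a_id = ?
""", ("willie", "carl")).fetchall()
print([r[0] for r in rows])  # the mutual friend of Willie and Carl
```

Adding `AND u.location = ?` to the WHERE clause gives the "in San Francisco" variant of the same query.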
But we don't have inner joins
on App Engine
like we're using here.
And--and we don't,
you know,
and people have said
we don't have joins.
Well, we do have
merge-join.
We can do self-joins.
The only really important
part of this is the last line,
which is joining
the two tables together.
So here's how we do
it in App Engine.
We have a person,
they have a location,
and they have
a list of friends.
And we do
a really simple query.
So it's
very natural sounding.
It's:
Give me all the people
where they have this friend
and they have that friend
in San Francisco.
That's it.
That answers--that answers
the question.
It'll de-dup.
It'll give you
unique result sets.
It'll do that
merge-join query
and give it to you.
And you can add
as many filters as you need
up to a practical limit.
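On App Engine that query is just chained equality filters, e.g. `Person.all().filter('friends =', 'bob').filter('friends =', 'willie').filter('location =', 'SF')` in the Python db API. Here's a plain-Python sketch of the semantics those filters have (the entities and names are illustrative, not the real demo data):

```python
# Each person has a scalar location and a list property of friends
# (on App Engine: db.StringProperty and db.StringListProperty).
people = {
    "mel":    {"location": "LA", "friends": ["bob"]},
    "bob":    {"location": "SF", "friends": ["mel", "stu"]},
    "stu":    {"location": "SF", "friends": ["bob", "willie"]},
    "willie": {"location": "SF", "friends": ["stu", "lenny"]},
    "lenny":  {"location": "NY", "friends": ["willie", "carl"]},
    "carl":   {"location": "NY", "friends": ["lenny"]},
}

def query(filters):
    """Chained equality filters, merge-join style: a list property
    matches when it contains the value, a scalar when it equals it."""
    results = []
    for name in sorted(people):
        props = people[name]
        if all(value in props[prop] if isinstance(props[prop], list)
               else props[prop] == value
               for prop, value in filters):
            results.append(name)
    return results

# Who is a mutual friend of Bob and Willie in San Francisco?
print(query([("friends", "bob"), ("friends", "willie"),
             ("location", "SF")]))  # ['stu']
```

The datastore answers this with index scans rather than a table walk, but the result set is the same Venn-diagram intersection.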
So let's demo this.
I'm gonna kind of go back
so you can actually
see these examples.
So...
I have two--
two things here.
So I'm--
let's say I'm Carl,
and I want to see...
and I'm looking at Willie.
And I want to say:
Who are the friends
who are common
to...Carl and Willie?
So you remember...
Carl and Willie, right?
Who are the friends
common to Carl and Willie?
And it's Lenny.
Oh!
So Carl, Willie--
Lenny is a common friend.
So there we go.
Pretty easy to--to parse.
Similarly, you can see
the friends Lenny and Stu
as being Willie's friends.
Now the next one
was answering:
Who is a mutual friend of Bob
and Willie in San Francisco?
And here I have this tab.
So...friends in common
to Bob and Willie
in San Francisco.
And this application
will let you
model these relationships
and do all
the various queries.
And it shows all the different
queries you can do
with a social graph.
And so you can use this
kind of as an example app
for how to do merge-join
where merge-join is useful.
So what's the performance?
Obviously this can't
come for free, right?
So merge-join scales with
the number of filters you have
and the size
of the result set.
The result set--
so that means
it's best for queries
with fewer results,
say, less than 100,
less than a few hundred.
If you have a lot
of overlapping data
or a lot of results that could
potentially match the merge-join
that could be problematic.
But usually that's okay
because a lot of these queries--
you're trying to find overlaps.
You're trying to add more
and more equality filters,
trying to narrow
that set of messages.
You're looking for
a needle in a haystack.
This is really useful
for answering those questions.
And the great thing is
it has similar performance
as list properties.
So it has
the same read/write speed.
There's no storage overhead.
There are no
extra indexes to write.
And exactly in the same way
with list properties
and relation indexes,
you can avoid all
the serialization costs
by using relation indexes.
So you can do all this
for exactly the same costs.
Now, there are some gotchas.
You need to watch out
for pathological datasets.
Too many overlapping values
causes a lot of zig-zagging,
like I said.
That's, effectively, where
we have to keep scanning through
the keys to find matches,
and we have to keep going
between our two different
windows in the merge-join
to find an overlapping set.
I think that if
your result set's small,
then the chances of this
are very low.
But you should go back through
this, you know, merge-join
and just think through,
you know,
what it's doing to understand
the potential for this.
Effectively, it's--
when you have to keep scanning
and scanning and scanning
inch by inch
to find matches,
back and forth.
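The zig-zagging just described can be sketched as a toy algorithm over sorted key lists, one list per equality filter. This is a simplification of what the datastore actually does, purely to show where the back-and-forth scanning comes from:

```python
from bisect import bisect_left

def zigzag_merge_join(scans):
    """Intersect several sorted, duplicate-free index scans zig-zag
    style: every scan seeks forward to the current candidate key, and
    any scan that overshoots raises the candidate for the others."""
    if not scans or any(not s for s in scans):
        return []
    results = []
    candidate = scans[0][0]
    while True:
        agreed = 0
        for keys in scans:
            i = bisect_left(keys, candidate)  # seek to first key >= candidate
            if i == len(keys):
                return results                # one scan ran out: done
            if keys[i] == candidate:
                agreed += 1
            else:
                candidate = keys[i]           # overshot: zig to a new candidate
                break
        if agreed == len(scans):
            results.append(candidate)
            i = bisect_left(scans[0], candidate) + 1
            if i == len(scans[0]):
                return results
            candidate = scans[0][i]

# Heavily overlapping scans force many candidate bumps -- the
# pathological "inch by inch" case from the talk.
print(zigzag_merge_join([[1, 3, 5, 7], [2, 3, 6, 7], [3, 4, 7, 9]]))
```

When the scans share few keys, the candidate jumps ahead in big steps; when they almost-but-not-quite overlap everywhere, each match costs many small seeks.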
Another thing
that's really important
is that this doesn't work
with composite indexes.
So the--the exploding
index combinations
will keep this from working.
And that's because when you
have multiple equals filters--
we're using merge-join
to do the merge.
Now, the second
you apply a sort order,
or you have any kind
of inequality filter
you need to add to this,
it breaks down,
because we need to--
we need to store,
in our index, the--
the cross product
of all of those list properties.
So...so that
gets really, really big
really fast, like I said,
and it won't work.
What that means is
you can't apply sort orders
to some things
that are used this way.
Now, that's--that's not--
that's not good.
But there's a couple
of things you can do.
So, you know,
you can sort in memory.
For a small result set,
that's okay,
especially if you use
relation indexes.
The--the actual size of
that data is very low
and it's cacheable.
The other thing is that,
if you remember
from the relation index code
that I showed you,
first we got all the keys
and then we transformed the keys
and then we retrieved the actual
entities we care about.
There's no reason you can't
do some filtering up front.
If there's something
that you know about your keys
that you can filter them
further,
you can do that filtering
way in advance
and, you know,
filter at the key level
before you even retrieve
the--the entities.
So that--that's another way
of optimizing this.
But...it's really--
it's really best
for membership tests,
Venn diagram tests
where the sorting is done
after the fact.
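That key-level filtering trick can be sketched in plain Python. The key format and `fetch_entities` helper are illustrative; on App Engine the index scan would be a keys-only query and the batch lookup would be `db.get(keys)`:

```python
# A pretend datastore: entity key -> entity.
datastore = {
    "Message:2009-05-27:1": {"body": "hello"},
    "Message:2009-05-27:2": {"body": "hi"},
    "Message:2009-05-26:9": {"body": "old news"},
}

def fetch_entities(keys):
    """Stands in for db.get(keys): one batch lookup, no per-key query."""
    return [datastore[k] for k in keys]

index_keys = sorted(datastore)  # stands in for a keys-only index scan

# Filter at the key level first -- here, keep only today's messages --
# so we never pay to fetch and deserialize entities we'd discard.
wanted = [k for k in index_keys if k.startswith("Message:2009-05-27")]
messages = fetch_entities(wanted)
print(len(messages))  # 2
```

The win is that the filtering happens on small, cacheable keys before the single batch get, not on full entities afterwards.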
Okay.
So we're gonna do
a little bit of wrap-up
and then we're gonna
get to some questions.
So I encourage you
to use list properties,
use merge-join
for many things.
I believe that this can solve
fan-out for your application.
It'll let you scale
fan-out problems very well,
especially when combined
with our Task Queue API.
This'll let you,
you know,
have a large set of entities
that match a certain index
without a problem.
You can use this same idea
to match geospatial info.
So instead of a list
of my friends,
I could have a list
of coordinates,
or a list
of geographical regions,
or bounding boxes
on a globe,
and do queries on that.
Again, these are
set membership problems.
Now if I add
merge-join to the party,
I can do relationship graphs.
I can do a bunch
of graph operations
that are very interesting.
I could also do
"fuzzy" values.
So if you have any--
any, you know, fuzzy values
that you need to store,
you can do that
with list properties.
For instance,
I can store today's date.
I can store the month,
today's date,
or I can store the year,
the month, the day,
today's date, you know,
the hour, the minute,
the second--
I can store a whole list
of all these different
resolutions of data
and then query
on any one of them
to get a different subsection
of the data.
So a lot of people,
when they do a query on time,
they say,
"Oh, give me everything
between these two times."
And that's not the right way
to think about this.
You need to think about
converting your queries
into set membership tests.
So compute those memberships
at write time
and you'll enjoy
very fast reads.
If you want to find
everything from today,
just mark it
as being from today.
If you want to find
everything from this hour,
when you write the--
when you write
an entity to the data store,
mark it
as being from today
at this hour.
And then all you have to do
to retrieve the data
is a single equality filter.
That's it.
There's no--
There's no inequality filters
required.
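Here's a sketch of that precompute-at-write-time idea for dates. On App Engine the `stamps` list would live in a `db.StringListProperty`; the format strings are just one illustrative choice:

```python
import datetime

def time_stamps(dt):
    """Precompute every resolution of a timestamp at write time, so
    reads become single equality filters instead of range scans."""
    return [
        dt.strftime("%Y"),              # everything from this year
        dt.strftime("%Y-%m"),           # ... this month
        dt.strftime("%Y-%m-%d"),        # ... this day
        dt.strftime("%Y-%m-%d %H"),     # ... this hour
        dt.strftime("%Y-%m-%d %H:%M"),  # ... this minute
    ]

entities = [
    {"body": "a", "stamps": time_stamps(datetime.datetime(2009, 5, 27, 10, 30))},
    {"body": "b", "stamps": time_stamps(datetime.datetime(2009, 5, 27, 11, 5))},
    {"body": "c", "stamps": time_stamps(datetime.datetime(2009, 5, 26, 9, 0))},
]

# "Everything from today" is one equality filter, no inequalities:
today = [e["body"] for e in entities if "2009-05-27" in e["stamps"]]
print(today)
```

The datastore version is `Entity.all().filter('stamps =', '2009-05-27')`: a pure set-membership test, paid for once at write time.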
If you want to see the demos,
I have put them up available
with source code.
This is hopefully useful.
It takes a little while
to get them up otherwise--
pubsub-test.appspot.com
and
dagpeople.appspot.com.
It's Python code.
And you can do
all these same things in JDO.
And we're working on
further optimizations for JDO,
but it's all
functionally equivalent.
There's more information
on our site:
code.google.com/appengine.
So now let's go
to some questions.
Let me load this up.
[applause]
Thanks.
So yeah, there are mics
here and here
you can come up to.
Yeah, cool.
So I'll just start off
with some of these
and then, if you
have any other questions,
please step up
to the mic.
And if you want to duck out,
thanks for coming.
So...
"What about ReferenceProperty
and ReferenceListProperty?
"Are they as efficient
as StringListProperty?
Can we apply the same practices
you described in your talk?"
Yes, you can.
You can.
The only reason you don't
want to do this, necessarily,
is because ReferenceProperties
include keys.
And keys can be
bulky sometimes.
It has the kind,
the app I.D.,
and the primary I.D. or name
of the entity
in that key.
So in terms
of actual storage space,
you know, a key might be,
you know, 50 bytes
and a single string
might be 10.
So it depends
on how hardcore
you're trying to do
your optimizations.
But, yeah, you can
use ReferenceProperties
if you want
to do it that way.
So...
"How well does
list property scale?
"How many entries
can we put in it
and not suffer
in performance?"
Hopefully
I answered that one.
For writes,
you're gonna pay a cost
to do that serialization,
to put it in the datastore.
But to retrieve it back,
you can avoid
that serialization cost
and overhead.
Yeah.
man:
I wanted to ask...
in merge-join,
about the WHERE clause.
You explained how
the zig-zag...
algorithm works.
So it seems that it matters,
the order of the clauses
inside the WHERE clause
for the zig-zag to work.
And hence it affects
the performance.
Is that true?
Slatkin: So sorry, could you
repeat that one more time?
man: Does the order of the
statements in the WHERE clause...
Slatkin:
Oh.
man: Affect the performance
of the query?
Slatkin: Right.
That's a good question.
Yeah, so the question,
I guess,
is--does the--yeah.
Does the order
of the equality filters
in the merge-join
affect the performance?
I believe, yes, it does.
So if you--if you know
that you have a certain property
that has a high correlation
to the result set,
you should list that first.
I think that we don't--
we don't have histograms
and we don't have
a query planner to use them,
so any hints
you can give us
on the way to find
your result set is--is useful.
And I think that
setting the equality--
the order of the equality
filters will do that.
I need to confirm that,
but that's
a really good question.
man:
It's not a big difference.
Slatkin:
It's not a big difference?
So, yeah, Ryan implemented it
along with Max,
and he says it doesn't
make a big difference.
I guess BigTable scanning
is very fast
so that could be
a good explanation.
Any other questions?
Yep.
man: So if you have, say,
a list of objects
that each have a price,
and the price is dynamic,
but you want to be able
to search on...
objects that have a price
above a certain value.
Is it possible
to do range queries
without having to build a bunch
of fancy indexes behind them,
because the data
might be changing?
Slatkin:
Yeah, so...yeah.
So that's--yeah,
so, you know,
if you have data
that's changing,
and you don't
want to have to have
indexes for range queries,
how do you do it?
So this is--this is
kind of the--
the answer
to these kinds of questions
usually is
precompute your views
at write time.
So if I have, you know,
an example would be
an auction site.
Show me everything between
the price of 0 and 5,
5 and 20,
20 and 100,
100 and 1,000.
You know that those are
going to be your views.
When you update
the price of an item,
add it--add that slice of data
to your list property.
And then you can query
just for set membership.
You don't have to do
a range query.
And if you need
to change your ranges later,
or add more ranges,
you just can write
some relation index entities
to do that,
or you can update the ones
you already have.
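A sketch of that price-bucket idea, using the example boundaries from the answer. On App Engine, `price_buckets` would feed a list property that gets recomputed whenever the price changes:

```python
# Fixed, precomputed views: whenever an item's price changes, recompute
# which slices it belongs to and store them in its list property.
BUCKETS = [(0, 5), (5, 20), (20, 100), (100, 1000)]

def price_buckets(price):
    """Return the bucket labels this price falls into."""
    return ["%d-%d" % (lo, hi) for lo, hi in BUCKETS if lo <= price < hi]

items = {"lamp": 3, "chair": 45, "sofa": 450}
index = {name: price_buckets(price) for name, price in items.items()}

# "Show me everything between 20 and 100" is now a set-membership
# test -- a single equality filter -- not a range query.
matches = sorted(name for name, buckets in index.items()
                 if "20-100" in buckets)
print(matches)  # ['chair']
```

Adding or changing ranges later just means rewriting the affected list properties, which is exactly the kind of background work the Task Queue API is suited for.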
And our Task Queue API that
we'll be talking about tomorrow
will greatly increase the--
your ability to do that.
But yeah, I mean,
when you have code
to update that index,
just, you know,
change the way you're
slicing your--your fuzziness.
That's what a fuzzy query is.
That's a really
good example of one.
We got some negative questions.
That's good.
[chuckles]
"Doesn't
the relation index entity
suffer from
the n+1 select problem?"
Who asked this question, and
could you please come to a mic?
I don't know
the n+1 select problem.
Anybody?
You, okay.
Can you tell us--what is
the n+1 select problem?
man: When you select--
when you select the...
the actual number
of elements first,
and then it does the select
for each of the element.
If you go back
to the slide...
Slatkin: Yeah.
man: There was
a "for" loop order.
Slatkin:
Oh.
So...
you're talking about...
relation index...
so this one.
Is this what
you're talking about?
man: Yes, this one.
There's a "for" loop, right?
Slatkin: This loop. That's
what you're talking about?
man: Yep.
Slatkin: So that loop
is just going over
the index result set.
Those entities
have already been...
So this is a list of keys,
these indexes here.
They're not separate queries.
They are not--they don't
require SELECTs.
man: And when you do
the db.get of keys
you actually get
everything at once.
Slatkin: Exactly.
So...so you're doing one SELECT,
one set of BigTable scans--
actually, one BigTable scan
to get all the keys.
And then you're taking
all those keys,
transforming them, and doing
one, like, hash table lookup
to pull
all those values back.
So you don't have--
there's no n+1 problem.
It's just--it's one query,
one get.
Cool.
I have one more...
I think this one's
a little out of scope,
so, yeah, I can answer this
for you later.
This one: "You talked about
avoiding list deserialization
"for--you talked about avoiding
list deserialization for reads.
"What about for writes?
"Can we add a list value
without deserializing
and serializing
the entire list?"
So there's a trade-off here.
You could just write
another message index entity.
That has a lot
of costs associated with it.
So you're paying--
the trade-off--the trade-off is
storage space
or CPU time.
You could write a new entity
with that index value,
a new relation index...
when you need
a new index value.
But then that's
a whole other entity
and it has
all of its associated indexes
and costs of the keys
and so on.
So you need to figure out
that trade-off for yourself.
If you don't really care
about storage space so much,
then that's
a really good solution.
But storage
is usually pretty cheap.
So I could see the trade-off.
But writes are also
very infrequent,
so it could make sense
to update things.
It would be awesome
if we had a way
of just appending
to a list property
without having to retrieve it.
That would be
the ideal solution.
And then we wouldn't
have to do any of this.
You'd just say,
"Oh, yeah,
"append this value
to a string list
"and don't even
come back to me.
I just want you to do it."
And that--
that would be--
that would be one way
to solve this also.
Yeah.
man: Yeah. What makes
a relation index entity
better than
just a join entity,
where you would have
the reference to the message
and the reference
to the user?
Slatkin:
Right, so...
on the join entity--
so yeah.
So--so the question is,
yeah.
What makes a relation index
better than a join entity?
A relation index, effectively,
is a join entity.
That's what it's doing.
It's serving
the same purpose.
The reason that it's useful is
that it's a child of the parent.
So if you saw...oh.
Right here.
The message index
is a child of its parent.
So we can scan for
message indexes very quickly,
get their keys, and then figure
out what their parents are.
We can de-reference
that pointer really quickly
in memory with a minimal amount
of serialization.
If you do this the other way,
with just, like,
one-to-many relationships,
or with queries,
or...
yeah, with a standard, like,
one-to-many relationship style.
Because we don't have joins,
you'd have to first
do a query
for ten items,
and then do a query
for each of those results
to actually
find the references.
And that's--that's
the problem.
So...so, yeah,
MessageIndex
is essentially a join table.
It's a speed-up,
like that.
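That parent-pointer de-referencing is just key manipulation, which is why it's so cheap. A plain-Python sketch, with an illustrative key layout; on App Engine you would simply call `key.parent()` on the keys from a keys-only query:

```python
# A relation index entity is stored as a child of the message it
# indexes, so its key embeds the parent's key as a prefix.
index_keys = [
    ("Message", 1, "MessageIndex", 101),
    ("Message", 5, "MessageIndex", 105),
]

def parent_key(key):
    """Drop the child's own (kind, id) pair -- the equivalent of
    db.Key.parent() on App Engine."""
    return key[:-2]

# One keys-only scan gave us index_keys; one in-memory transform
# yields the message keys; one batch db.get() would fetch them all.
message_keys = [parent_key(k) for k in index_keys]
print(message_keys)
```

No second round of queries is needed, which is the whole point of making the index entity a child rather than a peer.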
Yeah.
Dredge: Hi, Ert Dredge
from Universal Metaphor.
Do you have suggestions
when we're doing
our initial data architecture
for helping to implement
these suggestions
that might not be obvious
from the talk itself?
Slatkin: Yeah,
that's a great question.
So are there optimizations
that you should make
when implementing your data
or modeling your data
not captured here?
Well,
there are a few things.
There is a few kind of hidden--
hidden parts of this
that--that we need
to be more clear about.
For instance, how do you compute
the cost of indexes?
And how do you--
like, storage costs.
And how do you compute
the cost of...
also the datastore
storage itself?
We're working on making this
more clear to people
so they can actually kind of
calculate their index--
their--the index size
that their data would have.
And how to figure out
the overhead
of having composite indexes
and list properties.
So that's something
that would be best for you--
when you're modeling
your data, you say,
"Oh, well, if I need
this composite query,
that could explode,"
or that, you know,
"that's gonna require
an extra row,"
and then that has
a large cost.
Another thing
to remember is that
in a standard SQL database,
primary key is usually
just an integer
which has a very small
amount of storage space.
With us,
we also have I.D.s,
numerical I.D.s
that are primary keys,
but our keys, like I said,
encode the full path
to the element.
That includes the app I.D.,
the kind,
any child and parent
relationships are in there,
finally down
to the primary key.
And so, it's useful--
it sounds funny,
but it's useful to actually
use kind names that are short
and to use key names
that are short,
because it reduces
the amount of index space
you have to write,
which is kind of
a funny thing.
Another one
that's kind of similar
is there's a lot of
really interesting things
you can do around encoding.
So one of the ones I know--
I forget the name of it.
If someone remembers it,
shout it out.
There's this type
of encoding
where if you know something's
going to be--
happen really frequently,
you should just--you should
make its representation
be as short as possible.
There we go.
Huffman encoding, yeah.
So if you can do that--
if you know that,
you know,
like, you're gonna
have a celebrity
and they're gonna be using
your microblogging system
all the time,
or you have one node
that's connected to everything,
you want its primary key
to be very short.
Because its--its value
is going to be
in a lot of different lists.
It's going to be
in a lot of relationships.
It's going to have
a lot of index rows.
And you want
to minimize its size.
And so if you can use
that kind of encoding technique
you can really minimize
your storage space a lot.
Cool.
Over here.
man: Is there any way
to do like queries
or full text search?
Slatkin: To do like queries
or full text search.
So, yeah,
so the best answer we have--
there--okay, three answers.
So the first answer
is that we have
a "poor man's"
full text search,
which is the Google App Engine
ext search module
that Ryan Barrett wrote.
That just uses merge-join,
like I've described here,
to do full text queries.
It doesn't have
ranking and ordering,
and it has a bunch
of other problems,
but it's really good
for simple stuff.
And, like I said, it's good
for a small result set.
So if you know
that you're only gonna get
20...20 results,
100 results,
and then can sort them
in memory, it's great.
In terms
of full text search,
then you need something like
a real full text search.
This is, you know...
you need ranking
and you need all kinds
of duplicate elimination
and all kinds of stuff.
This is something that
we know our users want,
we're asked about
all the time.
I don't have
anything to announce
but we know you want it.
You know,
"This is Google.
"They don't have
full text search.
What's going on?"
Yeah, we know.
[laughter]
It's hard, actually.
So...
we're--we're--that's
the best answer I can give you.
I think there's a number three,
but I forgot what it was.
See if there was...
where did that--ah.
Is there anything else...
[indistinct chatter]
Oh, this one's
putting it down.
"One of the reasons for you
giving us list properties
"is to save storage space.
"Since Google is handling
storage, why do we care?"
Billing is enabled.
You can store...yeah.
You can--you can store
gigabytes of data.
It's not just one gig
of free quota.
Quotas--you can get
that quota really high now.
And it's pretty cheap.
So you can store a lot of data
without even knowing it.
So...yeah, you know.
We try to instill
the best practices
in our programming model,
in our APIs.
But we can't always do that,
and so this is kind of
a cautionary tale
to tell you to--
to think about storage
before you--
before you actually
rack up a big bill
or something.
I think
I'm out of questions, and...
so thanks a lot
for coming.
Again, I really appreciate it.
[applause]