>> Welcome, everyone.
It's my pleasure to introduce
Jieming Mao from Princeton University.
Jieming is doing his PhD
there, advised by Mark Braverman,
and he will tell us about
Algorithms in Strategic or Noisy Environments today.
>> All right.
Thank you for the introduction.
So okay, I'm going to talk about Algorithms in Strategic
or Noisy Environments. All right.
So, let me first talk about algorithms.
I think of an algorithm as in the following figure:
it takes some input and produces some output.
However, in some scenarios,
the algorithms are interacting with agents.
These agents can be very strategic:
they want to maximize their own utility functions.
Sometimes they are not that strategic,
but due to many reasons, for example,
limited knowledge, not spending enough effort,
or subjective preferences,
the data they provide might be noisy.
The point is these factors
sometimes can completely change
the solutions of the problems.
So, it's very important to take them into account if we
want to design algorithms for these settings.
I'm going to talk about two projects in this direction.
It's a very general direction.
In the first one, we consider the following scenario.
So, there is a seller repeatedly selling goods to a buyer,
and the seller is fully strategic while
the buyer is using
a no-regret learning algorithm to learn over time.
We want to understand what will happen if a buyer
tries to use a no-regret learning algorithm
against a fully strategic seller.
It's the first part. And in a second project,
the problem is motivated
by applications like Crowdsourcing.
So we consider a very simple problem:
you have a bunch of items and you
want to find the top k of
them by asking people on a crowdsourcing platform.
You ask them pairwise comparison queries.
These people are not that strategic,
but their answers can be noisy,
and we want to study how to
design algorithms in this setting.
In the end, I talk about
some future directions related to these two problems.
So, let's start with the first problem.
So, we consider the scenario of
bidders participating in repeated auctions.
Many of these auctions are not truthful,
so it's not clear how to bid.
So, this opens the potential of using
an online learning algorithm to decide how to bid.
There's some evidence, from this paper, showing that
bidders' behavior is largely
consistent with no-regret learning.
So, we notice that
these standard no-regret learning
algorithms are not designed
to be used against a fully strategic seller.
So, we asked the following question.
If the seller knows beforehand that
the bidders run no-regret learning over time,
can the seller extract additional revenue?
I will define what I mean by additional revenue later.
The quick answer to this question is yes,
even when there's only one buyer
and the seller is selling a single item in each round.
So, let me formally define the model.
So, there's a buyer, and in each round,
a value is sampled from some fixed distribution.
This distribution is also known to the seller.
And what can the seller do?
The seller can offer some bidding options.
Each option comes with two parameters:
x_t(b) is the allocation probability,
and p_t(b) is the price.
So, if the bidder bids b,
then the bidder will get the item with probability x_t(b),
and the bidder will be charged the price p_t(b).
These parameters have some properties,
but they are not very important.
In addition, we require
that there is always a bid that costs zero.
Basically, if your bid is zero,
you get the item with zero probability,
and you don't need to pay anything.
So, this is like the buyer can always decide not to
participate in that round.
The buyer, in each round,
will just select a bid,
and the buyer will
learn these two parameters after the round.
So, this is like the bandit setting.
You only learn the parameters of the bid you've chosen.
Our results also work for
the expert setting, where
the buyer learns the parameters of all the bids, all right?
So, this is the model.
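To make the interaction concrete, here is a minimal sketch of one round of this model in Python. The helper names (sample_value, seller_menu, choose_bid, observe) are hypothetical, just to show who observes what and when; they are not from the paper.

```python
# A minimal sketch of the repeated-auction protocol, with hypothetical
# helpers: sample_value() draws from the fixed value distribution,
# seller_menu(t) returns the round-t functions x_t (allocation probability)
# and p_t (price), and the buyer object is the learning algorithm.
import random

def run_auction(buyer, seller_menu, sample_value, T):
    revenue = 0.0
    for t in range(T):
        v = sample_value()                 # fresh value each round, i.i.d.
        x_t, p_t = seller_menu(t)          # menu may change over time, unannounced
        b = buyer.choose_bid(v)            # bid chosen before seeing x_t, p_t
        allocated = random.random() < x_t(b)   # item given with probability x_t(b)
        buyer.observe(v, b, allocated, x_t(b), p_t(b))  # bandit feedback: chosen bid only
        revenue += p_t(b)                  # the bidder is charged p_t(b)
    return revenue
```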
Here is the additional definition.
As we said, the buyer is
no-regret learning over time. So, what does that mean?
So, we define B(v) to be the best bid for
some value v. Here,
this term is basically the utility:
it's the value times
the probability you get the item, minus the price you pay.
So, basically for each value,
we define the bid that is the best
for this value over all rounds,
and no-regret basically just means that what the buyer gets
is not much worse than using the best bid for each value.
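Written out in symbols (my reconstruction from the quantities just described, not the exact slide), the benchmark bid and the no-regret condition are roughly:

```latex
% Best fixed bid in hindsight for a value v, over all T rounds:
B(v) = \arg\max_b \sum_{t=1}^{T} \left( v \cdot x_t(b) - p_t(b) \right)
% No-regret: on the rounds where the value is v, the buyer's realized
% utility is within o(T) of always playing B(v):
\sum_{t:\, v_t = v} \left( v \cdot x_t(b_t) - p_t(b_t) \right)
\;\ge\; \sum_{t:\, v_t = v} \left( v \cdot x_t(B(v)) - p_t(B(v)) \right) - o(T)
```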
So this is the setting. All right, any questions?
>> So the mechanism is fixed?
>> You mean the seller strategy?
>> Yeah. Or is it also changing with time?
>> Yes, so the seller can change the strategy over time.
Also, the seller is not telling the buyer
these two parameters before each round.
It's even okay if you tell the buyer
these two parameters, because
the buyer is just using no-regret learning.
But in general, you don't learn
the probability of getting the
item before the auction is finished.
All right. So, okay.
Oops. There are two interpretations of this model.
The first one is that there's
only one buyer and in each round,
the value is drawn from some distribution independently.
The seller knows the distribution but
doesn't know the realized value from the distribution.
This buyer tries to no-regret learn,
separately for each value.
There is this another interpretation.
So there are a bunch of buyers.
Each of them has a fixed value, and in each round,
only one of them participates in the auction.
The seller doesn't know which buyer comes,
and each of the buyers separately no-regret learns.
So, you can pick your favorite interpretation.
>> All right.
Let me tell you our results.
So we first show that if the buyer is using
some standard no-regret learning algorithm, like EXP3
(actually, we define a class called
mean-based algorithms, and most no-regret learning
algorithms are in this category),
then the optimal seller revenue is the full expected welfare.
Okay. So this is very large.
And if the seller is getting this revenue,
then it means that the buyer's utility
is at most o(T),
even if the buyer gets all the items.
So basically, this revenue is the best we can hope
for, up to some additive o(T) terms.
Then we show the second theorem.
So there exists a no-regret learning algorithm
tailored for this setting,
such that, actually,
the seller's revenue is not very high.
Okay. So it's equal to this term, which is basically T times
the optimal revenue you can get by selling the item
for one round using a truthful mechanism, okay.
So here, capital F means the CDF,
and 1 - F(p) is just the probability that the value is at least
p, so the term is the max over p of p times (1 - F(p)). So we know that
if you sell the item for
only one round using a truthful mechanism,
the best thing you can do is to sell it at
a fixed price and let
the buyer pick whether to buy or not.
And basically, the optimal revenue
is captured by this term.
>> What is the gain of the buyer?
This is the revenue for the seller?
>> Yes. So, in this work we
focus on the view of the seller.
Basically, theorem two is saying that you're not going to
prove theorem one for
all the no-regret learning algorithms.
If somehow the buyer is using
some unusual no-regret learning algorithm,
then your revenue won't be very high.
>> But is the buyer motivated to use it,
if they're doing better than the [inaudible]?
Is the buyer doing better in
theorem two than in theorem one, so you're getting-.
>> So, yes. It is definitely better.
>> Whatever it is, what is the buyer's utility
in this no-regret learning setting?
>> So for example,
if the buyer is using this strategy,
then from the seller's side,
no matter what you use,
you won't get a lot of revenue,
so you might as well use some trivial strategy
where you don't change
your parameters over time.
>> But is the buyer better off here?
Is he motivated to use this algorithm?
>> Does the seller use a fixed price strategy?
>> Yes.
>> In that case, when the buyer
uses this particular algorithm A,
what is the buyer's utility?
>> In theorem two, right?
>> Yes.
>> So basically, if the buyer is using this strategy,
the seller will just sell at a fixed price over time.
So whatever the buyer gains in that case,
the buyer gains the same here.
But anyway, in theorem one,
the buyer is getting very little, like o(T).
>> But in theorem two, he's getting something linear.
>> Yes, getting something non-trivial.
So you can at least avoid
the awkward situation in theorem one. All right.
In the end, we study
the scenario where the buyer is the same as in theorem one,
but there's an additional constraint:
the buyer never overbids.
So what happened in theorem one is that
the buyer would actually learn to overbid,
because it looks good in the beginning.
And we think that's not very realistic,
because in many auction formats, overbidding
is never a good option.
Maybe the buyer just runs no-regret learning
over all the options that are
smaller than the value, okay.
In this setting, we characterize
the seller's optimal revenue by a linear program.
And it's basically between
the terms of theorem one and theorem two.
>> What do you mean? For
some distributions it's unboundedly
better than this, and for some other distributions?
>> So there exists a distribution for which it's unbounded,
like, unboundedly better.
>> Then you mean linear?
>> The ratio depends on the distribution.
So you won't get an approximation ratio that's
not dependent on the distribution's description.
So it depends on the distribution.
>> But there are two different distributions, right?
>> There exists one distribution.
>> One distribution.
>> Yes. All right.
You will see it here. So it's
basically the equal-revenue curve.
>> Your theorem three says that if the buyer never overbids,
then, like, the bad outcome of
theorem one won't happen? Theorem one.
>> So, with this constraint,
theorem one won't happen,
but the bidder is still using a mean-based algorithm.
Yes. So you can still get some revenue better than
the trivial one. All right.
>> How should I understand all of
these theorems together? Because you said in
the motivation that buyers
will use some no-regret algorithms.
>> Yes.
>> So do they use the one in
theorem two, or the one in theorem one?
>> I would think, since
the buyer is using a no-regret algorithm,
probably the buyer will simply use
something like theorem one, or maybe theorem two.
But theorem two is for illustrating
that you won't prove
this thing for all no-regret algorithms.
Because once a buyer tries to reason about the strategy,
maybe the buyer will do something even smarter.
It's not clear if the buyer will still use
a no-regret learning algorithm or not.
Maybe the buyer will use this one,
because a no-regret learning algorithm works well
if the seller's strategy is fixed over rounds.
So maybe the buyer will use this, yes, not exactly.
>> Which one should the buyer use?
>> Actually, if the seller strategy is fixed over time,
the buyer should use a no-regret learning algorithm.
>> So what you're saying is: which
no-regret learning algorithm matters?
Which one matters?
Because not all buyers are the same.
>> So, the point is that if
the buyer doesn't think too much about the scenario,
and just picks a no-regret algorithm and uses it,
then maybe it will result in something bad.
And if the buyer thinks a little bit more, then
at least the seller won't get good revenue,
and the buyer will guarantee something, all right.
So the plan of the talk is to show you
an example to illustrate these three theorems.
So it will be a specific distribution.
It's very simple; there are three values:
one quarter with probability one half,
one half with probability one quarter,
and one with probability one quarter.
So the optimal one-shot revenue
in this case is one quarter.
For example, if you sell at price one half,
the buyer will buy with probability one half,
so you will get revenue one quarter;
and the expected value
of this distribution is one half.
There is some gap between these two values.
So first, as a sanity check,
I want to show that there exists a very simple strategy
for the seller that guarantees at least the one-shot revenue,
so at least one quarter per round in this case.
So basically, the seller just sets two bidding options.
One is charging the optimal reserve minus epsilon,
so it's like the optimal fixed price
for the one-shot mechanism.
So in this case,
it can be one quarter,
one half, or one; the
minus epsilon is so that
there is no problem of tie-breaking.
And if the bidder chooses this option,
the bidder always gets the item;
the other option is the zero bid.
Okay. So notice that
this strategy doesn't even change over time.
And what will happen to a no-regret buyer?
If the value is larger than the reserve,
the buyer will learn to play option one,
will mostly play option one;
otherwise the buyer will
accumulate a lot of regret for not playing option one.
And similarly, if the value is smaller than
the reserve, the buyer will
learn to play option two.
So in this case,
the seller can at least get the one-shot revenue.
And actually, if you think about it more,
if the seller uses a fixed strategy over time,
then this is the best revenue you can get, okay.
So it seems that there is no problem with using
a no-regret algorithm, no matter which algorithm
you use, if the seller doesn't change strategy over time.
Okay. Now, I'm going to show you how to
get more revenue from
a mean-based buyer, okay.
So what does mean-based mean?
It means that in each round,
if there's an option, an arm, whose
average utility over all the previous rounds is
much larger than all the other arms',
then the mean-based algorithm has
to choose that strategy,
choose that option, with very high probability.
And in some sense,
if there
always exists an arm with
average utility much higher than the other arms',
then a mean-based algorithm actually behaves
like always pulling the arm with the largest mean.
So in this talk,
you can just think of it as always pulling the arm with
the largest mean, all right?
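As a rough sketch (my own simplification to the full-feedback case, not any specific algorithm from the paper), "always pull the arm with the largest mean" looks like this:

```python
# A minimal mean-based choice rule: track cumulative utility per arm and
# play the historical leader. Actual mean-based algorithms (EXP3, etc.)
# only need to agree with this when one arm leads by a wide margin.
def mean_based_choice(cum_utility, rounds_so_far):
    averages = {arm: u / max(1, rounds_so_far) for arm, u in cum_utility.items()}
    return max(averages, key=averages.get)   # leader on historical average
```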
And recall, we will show that
if the buyer is using this kind of algorithm,
then the seller can extract the full welfare.
Okay. So now I'm going to show you the seller strategy.
This strategy does not get one half times T,
but it at least gets something better than
one quarter times T. Okay.
So there are only two bidding options.
One is the zero bid:
you don't get the item and pay zero.
The second one is you bid
one and you always get the item,
but in the first half of the rounds,
you pay zero, and in the second half you pay one.
So the first half
is like a free trial, and in the second half,
you are asked to pay money.
So let's see what will happen with this strategy.
So, when the value is one quarter,
it's definitely good to bid one in the first half:
you gain one quarter in each round.
In the second half you lose three quarters in each round,
but you will still bid one for about T over six rounds,
because the average utility
of bid one is still better than bid zero.
So you will not immediately
switch from bid one to bid zero;
you will switch after T over six rounds.
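Here is the arithmetic behind the T over six, roughly (ignoring the small slack a real mean-based algorithm is allowed):

```latex
% Value 1/4, bidding 1: utility +1/4 per round for the first T/2 rounds,
% then 1/4 - 1 = -3/4 per round. The zero bid always has utility 0.
% Cumulative utility of bid 1 after s rounds of the second half:
\frac{1}{4}\cdot\frac{T}{2} - \frac{3}{4}\,s \;=\; \frac{T}{8} - \frac{3}{4}\,s,
% which stays above the zero bid's 0 until
s = \frac{T}{6}.
```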
>> I wanted to ask something.
So, when the seller switches to this charging,
this has to be announced, right?
>> No, it's not announced; that's the point.
So, if you participate in an auction, right,
I won't tell you how I change my reserve price.
You will see the result after you bid something.
So that's why the bidder is using no-regret learning.
>> In what sense is it the same arm?
So suppose I bid one and I get
the item for free, and I did it many times.
Now I bid one and I am suddenly charged.
So I don't view this as the same option as bidding one
and getting it for free.
So, you're thinking of this as the same arm, and it
has this wonderful weight from the first T over two steps. But now-
>> It's the same action, right?
>> Your auction is pretty vulnerable.
It shouldn't be viewed as the same action.
It's really new.
>> So I guess it would be problematic if, just
in this round, I charged you an infinite amount, right?
But I only charge you a bounded amount,
and you can still learn over time.
So what you lose in one round is not a big deal,
and also it doesn't change the results.
>> The no-regret assumption is that
the process is stationary, whereas here it's not.
>> Okay.
>> I guess, another way:
if you change the price gradually,
maybe that makes you feel better.
>> No.
>> No? Okay.
>> I don't complain about this,
but it depends on the horizon T,
so is there a trick we can use
so that it doesn't depend on T?
>> Yeah. In our model
we need to know how long it is, and somehow
it depends on the number of rounds or something,
but we just want to illustrate a simple case.
All right. And if
your value is one half, you will just bid
one from the beginning to the end, because it's
always a better option than bid zero until the very end.
But at the very end they are the same.
If your value is one, you will also
bid one to the end.
So let's calculate the revenue.
So if your value is one quarter,
you're paying for T over six rounds,
and one quarter happens with
probability one half, so you get revenue T over 12.
And you can do the same thing for the other values,
and you will get T over three in total, which is
larger than T over four.
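Spelling out the sum (each paying round pays one, and values one half and one keep bidding one through the whole second half):

```latex
\underbrace{\tfrac{1}{2}\cdot\tfrac{T}{6}}_{v = 1/4}
\;+\; \underbrace{\tfrac{1}{4}\cdot\tfrac{T}{2}}_{v = 1/2}
\;+\; \underbrace{\tfrac{1}{4}\cdot\tfrac{T}{2}}_{v = 1}
\;=\; \tfrac{T}{12} + \tfrac{T}{8} + \tfrac{T}{8}
\;=\; \tfrac{T}{3} \;>\; \tfrac{T}{4}.
```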
So where did the additional revenue come from?
One way to think about it is
that if you look at this bid one,
on average it's charging you one half.
So it makes sense that value one and value one half choose it.
But because we have this sudden change of price,
value one quarter is actually
paying for some number of rounds.
If we charged one half in each round, then value
one quarter wouldn't even participate in this bid.
So that's where we get the additional revenue.
>> So what happens if you assume that the bidder is
running one no-regret learning algorithm,
where the arms, the experts, are mappings from values to bids?
Does this still hold?
We are assuming there's a different no-regret instance for
each value, but suppose there is one no-regret
learner, one algorithm, and the actions
are functions from values to bids.
>> So I think
you can think of the value as
a context, as in contextual bandits.
And when there are not many contexts,
one way to run no-regret learning
is to run it separately for each context.
Also, the second algorithm I'll show will use
information across values, so
maybe that gets close to the idea.
Right.
So, all right.
Let's see the second result.
We show that there exists a tailored learning algorithm
that prevents the seller from getting additional revenue.
So what's the algorithm?
It's not very different
from a normal no-regret learning algorithm.
You still run no-regret learning over
all the possible bids, but also over some meta-arms.
Meta-arm B_j means
that you bid whatever you would bid if your value were B_j.
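The construction can be sketched roughly as follows: a simplification of my own, in the full-feedback case, with hypothetical names like current_bid_of; the paper's actual construction is more careful than this.

```python
# Hypothetical sketch: the arm set for the learner of a buyer with value v
# mixes raw bids with meta-arms that copy another value's behavior.
def arms_for(v, values, bids):
    raw = [("bid", b) for b in bids]
    meta = [("meta", v2) for v2 in values if v2 != v]  # "pretend my value is v2"
    return raw + meta

# current_bid_of(v2) is the bid the learner run for value v2 would submit
# this round; the utility is still evaluated with the true value v.
def arm_utility(arm, v, x_t, p_t, current_bid_of):
    kind, arg = arm
    b = arg if kind == "bid" else current_bid_of(arg)
    return v * x_t(b) - p_t(b)
```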
All right. So, we show a theorem saying that
when the buyer uses this algorithm, the optimal seller revenue
will be small.
So here's some intuition for why this works.
If you look at the no-regret constraint,
when value V_i doesn't regret using meta-arm B_j,
it means that the buyer with value V_i is
at least as happy as if
they had pretended their value was B_j.
In some sense, this looks very similar
to an incentive-compatibility constraint.
And what actually happens
in the proof is that we show that if
the seller can get additional revenue
against a tailored buyer in this setting,
then by some reduction we will get a
one-shot truthful mechanism that also
gets additional revenue, which is not possible.
So, here the seller cannot get additional revenue.
It's a very rough sketch of the idea.
Let's also go through the example.
So we will still look at the same seller strategy,
but now the buyer is using
this tailored learning algorithm.
So what will happen? For value one quarter,
the buyer will still
do the same thing, because
for it the strategy effectively doesn't change.
For value one half, the buyer now has three options:
one is bid zero, one is bid one,
and the third is to bid as if the value were one quarter,
that is, bid one until
time two T over three and then switch to bid zero.
And what happens is that now, for
value one half, this meta-arm is always the best strategy.
Sometimes it ties with
bid one, but that's not a big deal.
All right.
And for value one, we just upper
bound the revenue by T over two.
So let's see what the revenue will be.
So, value one quarter still gives revenue T over 12,
but now for value one half you
only get revenue T over 24.
And if you sum them up, it will be T over four.
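The sum now comes out to exactly the one-shot benchmark:

```latex
\underbrace{\tfrac{1}{2}\cdot\tfrac{T}{6}}_{v = 1/4}
\;+\; \underbrace{\tfrac{1}{4}\cdot\tfrac{T}{6}}_{v = 1/2}
\;+\; \underbrace{\tfrac{1}{4}\cdot\tfrac{T}{2}}_{v = 1}
\;=\; \tfrac{T}{12} + \tfrac{T}{24} + \tfrac{T}{8}
\;=\; \tfrac{T}{4}.
```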
So basically what happens here is that, remember, we
used this seller strategy to try to
get more revenue from value one quarter;
but now, because you're trying to get
revenue from value one quarter, value one half
can also use the same meta-strategy,
so you will lose some revenue on value one half.
This is the second result.
And finally, we're going to see the third one.
So now the buyer uses
some mean-based algorithm but never overbids.
Okay, so Matt came up with some names for these.
And, remember, this is
the result where we get some revenue in between.
So, now I'm going to just show you the optimal strategy.
So, there are actually three bids;
there's another bid, bid zero, which
I will just ignore.
If you bid one quarter,
in the beginning you don't get the item and you pay zero,
and after T over three you get the
item and pay one quarter minus epsilon.
If you bid one half, you always get the
item and pay one half minus epsilon.
So, let's see what will happen in this case.
If your value is one quarter, you will just bid
one quarter, because you don't overbid.
If your value is one half, you will bid
one half up to T over three, roughly.
And then you will switch to bid one quarter very quickly,
because you don't get any utility,
or you get very little utility, in the beginning.
And for value one, you will just keep
bidding one half; it seems to be the best option.
So okay.
So what's the revenue here? It's something complicated.
Basically, the answer is
that it's something better than T over four,
but it's not even as high as the T
over three which we showed in the previous example.
So, you don't get as much revenue
as from a normal mean-based buyer.
Basically, you cannot expect to get the full welfare,
because no one overbids.
So, whenever you lose
some value in the beginning, you just won't get it back.
So there are some observations
about this optimal strategy:
you always pay your bid if you get the item,
so this is like a first-price auction,
and the only change over time is that
the minimum winning bid decreases.
So basically, it's like running
a first-price auction
where we just keep decreasing the minimum winning bid.
And we show that this is the optimal strategy.
So, what is the proof like?
The proof basically upper
bounds the revenue by some linear program,
and then we show that the value of
this linear program can be achieved by
this specific strategy.
All right.
So finally, as I said, we want to
characterize this revenue, which
is characterized by the linear program;
we call it MBRev.
So, more concretely, we show that when
the value distribution is supported on one to H,
it is known that the expected value of
the distribution is at most log H times the one-shot revenue.
This was known before. We show that
the MBRev is at most log log H times the one-shot revenue.
And more importantly, we show that there exists
a distribution such that both inequalities are tight,
so you get this separation.
So, basically, the gap between MBRev and
the one-shot revenue cannot be bounded by a
constant independent of the distribution.
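In symbols, writing Rev for the optimal one-shot revenue, Val for the expected welfare, and MBRev for the benchmark above, the picture described here is roughly:

```latex
\mathrm{Rev} \;\le\; \mathrm{MBRev} \;\le\; \mathrm{Val}, \qquad
\mathrm{Val} \le O(\log H)\cdot\mathrm{Rev}\ \text{(known)}, \qquad
\mathrm{MBRev} \le O(\log\log H)\cdot\mathrm{Rev},
% with a single distribution (an equal-revenue-style curve)
% making both inequalities tight.
```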
All right. So to summarize the first half.
In theorem one, we show that if
the buyer uses a standard mean-based learning algorithm,
the seller can extract the full welfare.
Then we show that if the buyer is using
the tailored learning algorithm, then
the seller cannot get more than the trivial revenue.
And, in the end, we study
a more realistic setting where the buyer never overbids;
then the revenue will be in between, and we
characterize it by a linear program.
So this is the first half.
Now I'm going to talk about a different project.
The problem is very simple:
you want to find the top-k items from a set of n items,
and you're allowed to make
pairwise comparisons, and
these comparisons are noisy.
I'll tell you the noise model later. So,
the problem is motivated
by applications like crowdsourcing,
peer grading, and recommender systems.
So, let me give you a more concrete task.
So, I think this is a very
important task in crowdsourcing.
So, you have some data,
possibly also obtained from a crowdsourcing platform,
and you want to find
the good-quality items via crowdsourcing.
So what you will do is just ask
crowdsourcing workers to make pairwise comparisons,
and their answers can be noisy.
All right. So in designing
algorithms, we want to minimize the sample complexity.
This measures how many comparisons you make.
So, it's a pretty obvious objective, because it's directly related to
how much money you need to spend on
the crowdsourcing platform.
There's a less obvious one, which
is round complexity.
So, let me first define it.
>> Sir, can you please define what top-k is
in this case? How do you define this?
>> So, yeah. So let's say you have n items and they have
an underlying order, and you don't know the order.
>> Okay.
>> You only have the order.
>> But do the users know this order?
>> The users have some knowledge;
maybe it's not very accurate, but yeah.
Yeah, I'll define the noise model, so
you'll see what they can do.
But they have an underlying order
that's fixed but that you don't know.
All right.
So, what do I mean by round complexity?
So, basically, in each round the algorithm needs
to send all its comparison queries simultaneously.
So, it means that the pairs of items you
query cannot depend on the answers to other
queries in the same round.
So, you need to send them all together,
and you can classify
algorithms into categories by the number of rounds.
I think of passive algorithms as having zero rounds,
because you're not even allowed to
choose what comparisons to make.
And for active algorithms,
there are two extreme cases: one is non-adaptive,
where you only have one round; and the
fully adaptive case, where you have
no constraint on the number of rounds.
So, back to the question: why do we care about
round complexity, or why do we want to minimize it?
So, the observation is that
there are many crowdsourcing workers online,
and they can work in parallel, right?
So, if you are not asking a crazy number of queries,
then the amount of time you need to
spend to finish your crowdsourcing task
is decided by the number of
rounds, not the number of samples.
So, we don't want the number of rounds to be too big.
All right. So, in this work we study the trade-off between
the sample complexity and the round
complexity in the presence of noise.
So, it's like a trade-off between money and time.
All right.
Now I'm going to define the noise models; we
study very simple noise models.
The first one is erasure.
So, each comparison is
erased with some probability, say one minus gamma.
So it's like the crowdsourcing worker says,
"Okay, I'm not sure about this question."
The other noise model is just called noisy.
So, oh, by the way, in the erasure case,
if the answer is not
erased, then it's always correct.
And in the noisy case, the answer can also be
incorrect, but each comparison needs to
be correct with probability
at least one half plus gamma.
Back to your question, this is what
the crowdsourcing workers can do.
>> But once you have crowdsourcing, couldn't
the workers just be lazy,
rather than noisy?
>> Yes, but that's a different problem.
>> It's one story, though:
you can think of it as just noisy.
>> But I guess, for crowdsourcing,
it's also useful. All right.
>> If you'd like,
it could be looked at that way.
>> The rounds.
So this was looked at, but
the point of view of rounds was not looked at.
>> Yeah.
>> Actually, I am just
about to talk about some related work.
So there are two very related ones.
The first one is Bollobas and Brightwell.
They study exactly the same problem:
they want to find the top-k,
and they study the trade-off
between sample complexity and round complexity.
But they didn't study the noise.
Maybe they didn't have crowdsourcing
to motivate the problem.
There is another one that's also very related.
They study top-k and also some other similar problems.
They actually study the case with noise,
but they do not consider the round complexity.
And, yeah, at that time they were
motivating the problem by
something like an NBA tournament.
But now we can talk about crowdsourcing.
All right. So, now I'm going to talk about the results.
>> What was the motivation of Bollobas and Brightwell?
>> So they basically just wanted to find the top-k.
I don't know, but maybe parallel computation.
>> I think plotting graphs also.
>> They used graph theory.
That is very interesting.
All right, so here are our results.
First, we look at one-round algorithms.
We show that they perform pretty badly.
So sometimes you are forced to use a one-round
algorithm because time is really important.
For example, if you want to use pairwise comparisons
to review conference papers, you
probably don't want a multiple-round algorithm;
the conference would happen 10 years later.
All right, so we show that if you
don't spend much more than a linear number of comparisons,
you have to make this many mistakes.
We show an algorithm that achieves this bound,
and we also show that it is tight.
Here, in the erasure case,
you get a one over gamma factor;
in the noisy case,
you get one over gamma squared. So
the quick intuition for this is that if you-
>> What do these mistakes mean?
>> Okay. A mistake basically means that
you output something that is not in the top-k,
or you did not output something that is in the top-k.
>> Yeah, but what is that?
>> One mistake means you
output one item that is not in the top-k,
or, you know, you did not output an item that is in the top-k.
>> Yeah. Are you
saying that if I want to recover the top one, then?
>> Yeah, sorry.
Yeah, in our work,
we'll assume that k is
like theta of n, and also n minus k is
theta of n. You want
to output maybe the top half; maybe just think of that for this talk.
But as long as k is theta of n;
our result only works for that case.
>> I mean, I would imagine the other bound;
like, I want the top 10 results.
>> Okay. So our result also works for-
>> You can always take the minimum with
k for the number of mistakes.
>> Yeah.
>> Okay.
>> Sometimes you want to find the top one.
>> Top-1 is definitely easier than top-k. For top-k
with small k,
you can add some additional items,
and find the top half, right?
If you want to find the top one, you
can add about n additional items that beat everything.
Then, you solve the top half,
and you will get the top one;
but the thing is that we are not getting
the optimal dependency on k. You definitely
can use our algorithm to solve top-k
for small k. It's just maybe not a good dependence.
>> Actually, that can lose the meaning of this k,
because the dominant added items
will always be present, maybe.
The ones you really care about,
they are not present; so, essentially,
you're getting the min of this and k. It
should be a min with k.
>> Yeah. This is a lot of mistakes, right?
>> If k is, like, constant, it's pretty meaningless.
>> You computed this, so can you also lower bound that,
right, for each of them?
>> Yeah, the lower bound is for k equal to, like, a constant fraction of n.
Yes, it would be interesting to solve the problem for, like,
arbitrary k. All right.
This is the one-round result,
and then we study the case of multiple rounds.
We want to get zero mistakes, okay, in this case.
First of all, it was known before
that there exists
a four-round algorithm with a linear number of
comparisons that gives you zero mistakes (in the noiseless case).
Okay? It was first proved by them.
In this work, we get a simpler algorithm;
they definitely first proved it.
They also show that it's not
possible to do it in three rounds.
All right, so four rounds doesn't seem to be too bad.
All right. Then, in the erasure case,
we directly get a four-round algorithm with
n log n over gamma comparisons. What do you do?
You just use this four-round algorithm and
you repeat each query log n over gamma times.
Then you know that,
with high probability, no pair is fully erased:
each pair has at least one non-erased comparison.
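Roughly, the reason: if each comparison is erased independently with probability 1 - gamma, a pair repeated m = (c ln n) / gamma times has all of its copies erased with probability

```latex
(1-\gamma)^{m} \;\le\; e^{-\gamma m} \;=\; e^{-c \ln n} \;=\; n^{-c},
```

so a union bound over the linearly many pairs the four-round algorithm queries leaves every pair with at least one surviving answer, with high probability. (This assumes the erasure probability is 1 - gamma, which matches the information argument below.)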
>> You're right. In multiple rounds,
you can query the same pair.
>> Yes.
>> I see. Thank you.
>> Within the same-
>> Oh, I see. Okay.
>> Yeah.
I guess most of the results can be turned into
a version where you don't query the same pair twice,
but yeah. By information theory, it is very easy to see
that you at least need n over
gamma comparisons.
That is because an erasure-prone comparison
only conveys gamma bits of information.
It's not clear whether you need this log n or not.
The answer is actually that you don't need the log n.
If you spend a few more rounds,
you can remove the log n in the number of comparisons.
Here, log star is the iterated logarithm function.
So it doesn't seem to be many rounds,
and we show that if you really
want n over
gamma comparisons, you need this many rounds.
Okay? Finally, for the noisy case,
for the same reason, you
can get a good four-round algorithm.
>> If I have a really constant number, four or five rounds,
do you know what the truth is there?
Do you need the log n or not?
>> Yeah, I don't know.
Five rounds is probably a similar situation.
>> Of course, five is bigger than
log star of anything you'll ever
encounter, but this is theory;
so if you have a fixed number of rounds, four or five,
do you need the log n over gamma, or you don't know?
>> You can't do n over gamma.
>> So this is the same.
>> Yeah. >> This is a theta.
>> What? No.
>> There's a theta.
>> There's no theta.
>> There's a theta.
>> If you want the n over gamma, you need log star.
>> Yes.
If you really want n over gamma,
you need the log star;
but if you want something in
between, I don't know what that's like.
>> For n over gamma, you really need-
>> Log star.
>> Log star. There's a lower bound.
>> Okay. The theta is on the lower number of rounds there.
>> Or, finally, n log star n over gamma might be possible.
>> In five rounds.
>> I'm just switching these two.
>> Yeah, maybe, yeah.
>> Sometimes I feel log star is not
very large even in realistic settings, but all right.
Okay. Back to the noisy case,
we show that the same thing does not happen there.
Even fully adaptive algorithms
need to have this log n factor here.
Okay? So basically, in the noisy case,
you'd better just use this four-round algorithm.
and compare it with a one round algorithm.
If the one round algorithm
use a similar number of comparisons,
then that algorithm needs to make a lot of mistakes.
This is omega of L,
n over log in.
Maybe this is not too bad. All right.
Good. My plan is to show you the one-round algorithm, to
give you some flavor of how these algorithms work.
It will be a very simple case:
you want to find the top half,
even in the noiseless case.
What do you do? The first idea is that if you
want to find the top half, we should
just compare everything to the median;
but actually, since we allow some mistakes,
you probably just want to compare
to some approximate median.
Okay. This seems not hard to do if we have two rounds.
What do we do?
>> Why is this the case?
Are we just sorting?
>> Sometimes, you just want to select
the data with good quality,
or you just want to select papers with good quality.
Yes, sorting is also an interesting problem,
but it seems this-
>> Because for all these questions
you ask about top-k, I would ask about sorting.
>> Yeah, you should. Yeah.
>> But sorting, it could be harder.
>> It could be harder, yeah.
>> I think for sorting it is somehow easier to prove
lower bounds, especially on the number of rounds.
>> You get more than the noiseless case, but then
it doesn't look as nice.
>> Yeah, it just parallels the sorting.
>> So you could just do every comparison
until you're sure of it, so-
>> Yeah, but then it's adaptive.
>> Yes, so sorting is also interesting here, but somehow
it turned out that we studied this problem first.
>> Usually, you need [inaudible]
for the sorting, yeah, sure.
>> Then, Matt, is there a follow-up paper on sorting?
>> Matt.
>> Yeah, I think that was that paper;
they actually studied sorting, a similar one.
>> I don't know.
>> Yeah.
>> That was the paper, it was Matt's.
>> Okay, so maybe next time.
>> Anyways. All right.
How do we find the approximate median?
Sorry, how do we find the top half?
If we have two rounds, it will be easy.
We first randomly pick
a skeleton set S of size square root of
n, and we compare each pair in
this skeleton set in one round, okay.
After this round, we will just
look at the median of this skeleton set.
By some concentration argument,
this median will rank close to the actual median.
In the next round, we can just compare
everything to this element, and then we
will solve top half with a small number of mistakes,
around n to the three quarters.
Now, the question is how do we
do all these things in one round?
Here's the algorithm.
We still pick a skeleton set S of size square root of
n, and we still compare
each pair of elements in the skeleton set;
but in the same round,
we also, for each element that's not in the set,
compare it to d minus one random elements in
S. The reason we do this is that we
don't know which one is the median of the skeleton set,
so the best thing to do is compare to some random elements.
How do we output?
Basically, it's similar to the two-round algorithm.
For
any element that's outside the skeleton set,
if we know that it beats the median of the skeleton set,
we will say, "Okay,
it's in the top half." If we know it doesn't,
then we will say it's in the bottom half.
Otherwise, we just give some arbitrary answer;
it doesn't really matter. All right.
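Here is a minimal sketch of this one-round algorithm in the noiseless case, as reconstructed from the description above. The comparison oracle better(i, j) is a hypothetical name (True iff i beats j), and all queries are formed up front, so they fit in a single round.

```python
import random

def one_round_top_half(items, d, better):
    n = len(items)
    skeleton = random.sample(items, max(1, int(n ** 0.5)))
    sk = set(skeleton)
    outside = [x for x in items if x not in sk]
    # Non-adaptive queries: all skeleton pairs, plus d-1 random skeleton
    # elements for every outside element.
    probes = {x: random.sample(skeleton, min(d - 1, len(skeleton)))
              for x in outside}

    # Rank the skeleton internally by wins and take its median.
    wins = {a: sum(better(a, b) for b in skeleton if b != a)
            for a in skeleton}
    median = sorted(skeleton, key=lambda a: wins[a])[len(skeleton) // 2]

    top = [a for a in skeleton if wins[a] >= wins[median]]
    for x in outside:
        if any(better(x, j) and wins[j] >= wins[median] for j in probes[x]):
            top.append(x)          # x beats someone at or above the median
        elif any(not better(x, j) and wins[j] <= wins[median] for j in probes[x]):
            continue               # x loses to someone at or below the median
        elif random.random() < 0.5:
            top.append(x)          # no separating probe: answer arbitrarily
    return top
```

With d a constant, this uses O(n) comparisons in one round and, per the analysis that follows, makes on the order of n over d mistakes.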
How does the proof work?
It's actually simple.
>> One question. In your loss function,
is there a weight on
the actual ranking, like,
the first element will never go out?
Like, for example, [inaudible].
>> We consider them, like, equally important.
>> Yeah, because if I think this
is my favorite evaluation, then,
like its use in a conference,
you don't want the first answer to
go out, no matter what.
>> Yeah, not all mistakes are equal, but yeah,
the algorithm actually only makes
mistakes on items close to the middle;
but we never tried to formalize that.
I guess, actually,
the algorithm probably won't make mistakes on
the first item, but we did not try to prove it.
>> But even so, the algorithm really sort
of is not taking this into account, right?
>> Back to this one.
The mistakes only happen in the middle, right?
The first one will always be positioned correctly,
but we did not try to do it more carefully.
It's like conference paper review:
the borderline papers are really messed up.
>> So that actually makes sense there.
>> Yeah.
>> All right.
So back to the analysis.
Basically, if element i
is compared to some element j that's
in the skeleton set, and this j
is between i and the median of the skeleton set,
then we will later figure out on which side of the median i is.
Basically, we just try to bound our error probability by
the probability that such a j doesn't exist.
Then, if we sum up this term
(you don't need to parse it),
you will get n over d. It's actually pretty simple.
All right.
>> The number of mistakes,
in this example, how large is it?
>> This one is like n over
d. If d is a constant, it's like linear.
The previous example, with two rounds,
gets n to the three quarters.
Actually, you can make it go down to n
to the one half by a more complicated strategy;
but for only one round,
if d is a constant, you get a linear number of mistakes.
We can also show a lower bound,
but it's different from the algorithm.
All right. Here are some ideas
for the noisy case. It becomes a harder problem:
you don't necessarily learn the median
of the skeleton set,
and some useful comparisons will be erased or
flipped, and we need
more complicated strategies, but
I don't have time to talk about them.
All right. Now, some future directions.
For the first project, the direct open problem
is to study
what happens if you sell
to multiple no-regret buyers.
One hard thing there is
that now you have this competition,
so it's not clear how to allocate items in a good way.
It becomes much harder than
the case when there's only one buyer, and also it's
hard to study when
multiple no-regret algorithms are
running at the same time.
They have weird interactions.
We also studied some similar problems,
some problems in a similar flavor; so
this problem is with Matt.
We consider a principal running
a multi-armed bandit algorithm,
but the arms are strategic agents.
Each arm, if pulled,
will get some private value, and the
arm can decide how much of it to give to the principal,
and the arm tries to maximize
the total sum of the values it keeps for itself.
In our work, we show that in certain settings you only
get very bad negative results.
Basically, the principal cannot get a lot of value;
but it would be interesting to get
some positive results in
some reasonable setting, and more.
In general, we're very interested in the direction of
learning with strategic agents.
There are more successful stories when
the strategic agent only
participates in one round of the game.
Usually, that is easy to reason about,
because the problem really
just has a two-level structure.
The problem becomes harder to reason about
if the strategic agents participate in
many rounds with adaptive strategies. All right.
For the second project,
one related problem is to study
some similar problems like
sorting and approximate top-k.
Actually, approximate top-k is very interesting; it seems
very close to top-k. Now, yeah,
instead of outputting k elements that are in the
top-k, you are allowed to output k elements
in the top l, where l is slightly larger than
k. We want to understand
whether the problem becomes much easier or not.
Okay? There are also some other problems, like studying
more complicated noise models that
are closer to human behavior.
All right. All right. Thanks.
>> Thanks for your time.
Well, let's take more questions offline.