Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
It's well described though. Main tricky thing is "normalized divisor". But put it aside for a moment. Lets say the problem is hard. So, lets make it easier. Easier problem is when you want to divide number N words (limbs in terms of gmp) by 1 word (limb). It's easy. It's reduced to division of 2 word by 1 in succession.
 XXXXXXX
-ZZ
You find such number Y so d*Y = ZZ and remainder is less than Y, which means one digit of dividend is turned into zero after subtraction. So, problem reduced into N-1 words division by 1 word. Here where comes "normalized divisor". What if divisor is just one = 1? Then quotient is two words! For divisor 1 it doesn't make carries when you add quotient to ending result, but if you have divisor small, it may cause carry. Also, you may not have division that may output quotient 2 words size. That's why we normalize divisor, by shift of both: dividend and divisor. Because A/B is equal (A*2)/(B*2). Take care with remainder though. Alright. Now we can divide arbitrary long number by single word. What if divisor is also arbitrary long? Well... reduce to simpler task: to division by single word again. What if we neglect all words (limbs) except most significant? Lets say D is original divisor, and D' is D with all zero words except most significant:
DDDDD = D
D0000 = D'
Then D > D', and X / D < X / D' (the higher the divisor, the lower the result). Now, idea is similar. Lets divide X by D':
 XXXXXXX
-ZZ
We need to find such y that y*D' = ZZ is less than XX and remainder XX - ZZ is less than D'. You can do that by simple division of two words by one. Now, instead of simple subtraction from X. We'll calculate y*D - multiplication of 'guessed' y by real D - with neglected words (limbs) turned back. This is single word times arbitrary long number multiplication. 1xM multiplication in terms of gmp (where M = number of words (limbs) of D). Now, because X / D < X / D' result of y * D may be larger than we need. In this case we need to reduce y by one. And instead of additional multiplication, we just subtract D because: (y-1) * D = y * D - D. Taking into account that we want to subtract result from X, you could subtract y * D and if it turned into negative, you could add D back. This is what is described in quote:
Such a quotient is sometimes one too big, requiring an addback of the divisor, but that happens rarely.
Now, you know highest word (limb) of quotient. And current X has current remainder. Repeat this process until X is less than D, and you're done. And... why don't you just use gmp for example? Or other package that is made for things like that? :o Regarding thing with probability... I'm not specialist there. I see only two ways. Numerical approach and Analytical approach. Numerical approach it's monte carlo or any other numerical. For example, you may integrate using various numerical methods, like Runge–Kutta, Euler method, or Gaussian quadrature... Analytical approach is harder just because you have to scramble formulas. There are some packages for it, like in numpy for Python... but it's way complicated. I'm horrified from it. If you don't need to program stuff but just solve exact task like you noted, then you indeed may find distribution first, then apply suitable methods. Not for any integral of analytical funciton is expressed with elementary functions.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
r57shell wrote:
Easier problem is when you want to divide number N words (limbs in terms of gmp) by 1 word (limb). It's easy. It's reduced to division of 2 word by 1 in succession.
Yes, the so-called "short division" method. It's actually fascinatingly simple to do in x86 asm, since the div instruction seems to have been designed precisely for this purpose. It can be described in pseudocode as:
    1. index = 0 2. R = 0 3. A = dividend[index] 4. Divide the double-word value RA by the (one-word) divisor and put the quotient in result[index] and remainder in R. 5. Increment index and if it's less than the size of the dividend, jump to step 3. 6. When this finishes, the remainder of the entire division will be in R.
(It actually took me some time to figure out the simplicity of it, because basically every single description of "short division" out there is needlessly complicated. Distilling the algorithm into its simplest, and most efficient, form required a bit of thinking about it.)
And... why don't you just use gmp for example? Or other package that is made for things like that? :o
Hobbies are not always done for practicality, but because of personal interest and learning. (In fact, I have learned quite a lot of interesting and even useful things during this project.) That being said, it's not out of the realm of possibility that this project might end up being a viable alternative to what already exists out there. Looking at multi-precision libraries out there, the vast majority of them can be roughly divided into two types: The absolutely massive ones that support and do everything between the heaven and the earth (and which consist of millions of source files, often held hostage by a unix-only configure script, making it extremely hard to use in other environments, or for targets not supported by that script), and the very simplistic and inefficient ones that seem to have been done with the principle of "as long as it works" (and "efficiency is not even the goal here, if you want something efficient, use GMP"). From what I see, there seems to be a gap in the "market" that offers the best of both worlds: A very simple and straightforward-to-use library (consisting of just a couple of files that require no configuration) that still offers very decent and competitive efficiency. I'm not saying my project will fill this gap, but hey, who knows. I will have to study your description to see if I can understand how the algorithm works.
Joined: 7/13/2014
Posts: 36
OmnipotentEntity wrote:
I'm looking into some PPL (probabilistic programming language) stuff, and as part of that I'm trying to come up with approximations of the results of various functions applied to distributions. For instance, let's say I want to apply the sigmoid function 1/(1+e-x) to a normally distributed random variable. We can solve for the cdf of this distribution by feeding the inverse of the sigmoid through the cdf of a normal distribution. From this we wind up with a reasonably complicated distribution on (0, 1). I'd like to estimate this distribution with a beta distribution, but I have no idea how to go about it. Although I have an expression of the pdf of the resulting distribution, I cannot even calculate the expectation value of it. Outside of SVI or Monte Carlo methods is there a way to calculate this?
If I'm not mistaken, what you're describing is a logit-normal distribution, which unfortunately has no closed-form solutions for any of its moments. StackExchange has further practical discussion here: https://math.stackexchange.com/questions/207861/expected-value-of-applying-the-sigmoid-function-to-a-normal-distribution. For arbitrary functions of random variables, the Delta Method can be helpful for estimating moments and describing their asymptotic distributions.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
r57shell wrote:
Alright. Now we can divide arbitrary long number by single word. What if divisor is also arbitrary long? Well... reduce to simpler task: to division by single word again. What if we neglect all words (limbs) except most significant? Lets say D is original divisor, and D' is D with all zero words except most significant:
DDDDD = D
D0000 = D'
Then D > D', and X / D < X / D' (the higher the divisor, the lower the result). Now, idea is similar. Lets divide X by D':
 XXXXXXX
-ZZ
We need to find such y that y*D' = ZZ is less than XX and remainder XX - ZZ is less than D'. You can do that by simple division of two words by one. Now, instead of simple subtraction from X. We'll calculate y*D - multiplication of 'guessed' y by real D - with neglected words (limbs) turned back. This is single word times arbitrary long number multiplication. 1xM multiplication in terms of gmp (where M = number of words (limbs) of D). Now, because X / D < X / D' result of y * D may be larger than we need. In this case we need to reduce y by one. And instead of additional multiplication, we just subtract D because: (y-1) * D = y * D - D. Taking into account that we want to subtract result from X, you could subtract y * D and if it turned into negative, you could add D back. This is what is described in quote:
Such a quotient is sometimes one too big, requiring an addback of the divisor, but that happens rarely.
Now, you know highest word (limb) of quotient. And current X has current remainder. Repeat this process until X is less than D, and you're done.
I really hate to sound so rude, but after reading this approximately 20 times, I still can't make head or tails of it... I don't even understand what this means:
Lets divide X by D':
 XXXXXXX
-ZZ
We need to find such y that y*D' = ZZ is less than XX and remainder XX - ZZ is less than D'.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
4EDCDD = X
   328 = Y
*2
9DB9BA = X*2
   650 = Y*2
*2
13B7374 = X*4
    CA0 = Y*4 = D

D' = C00

13 / C = 1
1 * CA0 = CA0

13B7374 = X*4
-CA0    = D*1000
0717374 = X*4 - D*1000

71 / C = 9
9 * CA0 = 71A0

0717374 = X * 4 - D * 1000
-71A0 = D*900
negative -> 8
so 71A0 - CA0 = 6500
0717374 = X * 4 - D * 1000
-6500   = D * 800
00C7374 = X * 4 - D * 1800

0717374 = X * 4 - D * 1000
-6500   = D * 800
00C7374 = X * 4 - D * 1800

C7 / C = 10
10 * CA0 = CA00

00C7374 = X * 4 - D * 1800
 -CA00 = D * 100
negative -> F
CA00 - CA0 = BD60
00C7374
 -BD60 = D * F0
0009D74 = X * 4 - D * 18F0

9D / C = D
D * CA0 = A420
0009D74 = X * 4 - D * 18F0
  -A420
negative -> C
A420 - CA0 = 9780
0009D74 = X * 4 - D * 18F0
  -9780
00005F4 = X * 4 - D * 18FC

-- results --
18FC * CA0 + 5F4 = 13B7374
18FC * (328*4) + 5F4 = (4EDCDD * 4)
5F4 / 4 = 17D
18FC * 328 + 17D = 4EDCDD
Player (36)
Joined: 9/11/2004
Posts: 2631
I'm having difficulty finding a satisfactory answer to a question I came up with, and I'm wondering if you guys would be of help. So we've all heard about the 12 ball problem (right?), where you have 12 balls, one is the odd ball, it might be heavier or lighter, and your job is to find the odd ball and tell me if it's heavier or lighter, and you only have a balance scale that can indicate if the two arms are balanced or if they're not which tray is heavier. There's a reasonably good information theory way to analyze this result. There are 24 possible states, so the total entropy is log_2(24) = 4.585 or so bits. Because the balance gives three possibilities, the best possible method would be to maximize the entropy gain at each step, which you can roughly do by arranging the balls to give you 1/3, 1/3, 1/3 probability (or as close as possible) with each weighing. This gives a lower bound of log_3(24) weighings = 2.893. So best case scenario you would be able determine the unique case out of 24 with only 3 weighings, and in fact, you can achieve this. What I'm trying to find is an analogous version of the problem where there are 2 odd balls out of 18. The odd balls can be any weight, they do not have to be the same. They can be the same or different. So there are 7 total distinguishable cases: both heavier same magnitude, both heavier different, (same with lighter), and then one heavier one lighter x3 (H bigger, L bigger, same). However, if you're selecting from these 7 cases to simulate, because order matters in case where the balls are not identical, there are 12 total ways to choose these two balls. So the total possible beginning states is 12 * 18C2 = 10.842 bits of entropy. The best we can do is the same 1/3, 1/3, 1/3, but now hidden state matters a lot. In theory this should be possible in 7 weighings because log_3(12 * 18C2) = 6.841 or so It should be noted that log_3(12 * 19C2) = 6.942 or so, which means that it might be possible to do 19 balls in 7 weighings as well, but I went with 18 because something similar happens in the one ball case between 12 and 13, it has to do with not being able to get perfect 1/3, 1/3, 1/3 probabilities. So I went with 18 because of that and because it has a nice factorization for this problem. Is 7 weighings actually achievable in this problem? If so, what would be the strategy? And then for me: what's the best way to keep (non-redundant) track of the probabilities of the various hidden states? I tried just keeping them in a matrix, but a lot of the possibilities seem to overlap, and I was constantly finding myself double counting things.
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
I don't think there is any strict relation between entropy and strategy for games like that when you can "tune" strategy on the way to the end. For 12 balls and light/heavy odd ball there are already many subcases so it's already very tricky task. When you add second ball there are some things to consider. Do you allow strict opposite difference? for example, normal wieght is 5 and two odd balls are 4 and 6. so 4+6 = 5+5? For single ball there is simpler version, when you know in what way it's odd: heavier or lighter. Then task is easier. So, you may try to solve 12 balls 2 heavier balls first, and find out smallest possible number of steps. But, what comes to my mind: try original problem with 11 balls instead. Entropy should be lower, but I expect more steps. Reason is obvious, I don't know anology for first step of solution for 12 in case with 11 :p Edit: Ah, no, it works. I'm retarded. But 13 balls is log326 weighings = 2.966 I don't think 13 balls is possible in 3 steps.
OmnipotentEntity wrote:
So there are 7 total distinguishable cases: both heavier same magnitude, both heavier different, (same with lighter), and then one heavier one lighter x3 (H bigger, L bigger, same). However, if you're selecting from these 7 cases to simulate, because order matters in case where the balls are not identical, there are 12 total ways to choose these two balls. So the total possible beginning states is 12 * 18C2 = 10.842 bits of entropy.
No, you have a mistake. 8*С212 = 9.044 bits of entropy. Reasoning is following. First, choose an answer - places with odd balls. Variants with different answers are obviously different. So, now, the only reason why variants can be different with same answer is their weight. There are four possibilities: LL*3, LH, HL, HH*3. LL and HH multiplied by 3 because LL or HH can be < = > in their weight (only because in your task you may have L and L different). Order matters, because when we know answer, and pick two places in ordered way, we can place all eight those possibilities, and this would give different beginning states. I can also say, that theoretically you don't need to know are they heavier or lighter to give an answer, so theoretically, you could delete this 8 multiplier and 2 multiplier in original task! This is another reason why I don't think there is strict relation with entropy.
Player (36)
Joined: 9/11/2004
Posts: 2631
Do you allow strict opposite difference?
Yes, I called that out specifically as one of the 7 total distinguishable cases.
No, you have a mistake. 8*С212 = 9.044 bits of entropy. Reasoning is following. First, choose an answer - places with odd balls. Variants with different answers are obviously different. So, now, the only reason why variants can be different with same answer is their weight. There are four possibilities: LL*3, LH, HL, HH*3. LL and HH multiplied by 3 because LL or HH can be < = > in their weight (only because in your task you may have L and L different). Order matters, because when we know answer, and pick two places in ordered way, we can place all eight those possibilities, and this would give different beginning states.
I think you misunderstand what I'm considering the end state. Because there are 7 total distinguishable cases with 12 ways of making those 7 cases, I'm considering the end state to be when each ball is reduced to one viable hypothesis. Not just when I decide their relationship solely to the normal balls. Also I believe you may have entered the data in your calculator wrong? It's still 18Choose2, not 12Choose2. Anyway, here is my progress on it from last night. Each ball can be represented as a set of possible hypotheses. A hypothesis takes into account the state of all balls, because tracking only the identity of specific balls is not really feasible considering with two odd balls there are mutual dependencies. I've managed to generate a complete list of possible hypotheses, which matches my original calculation. Then write a partitioning function that creates "experiments" from groups of (hypothetically identicals) balls. Which... isn't the most efficient... (Does anyone know a really fast algorithm for partitioning colored balls into colored bins, where every bin of the same color needs the same number of balls?) Next, I'm going to set up a system for generating a graph, and perform an A* search on it using the entropy remaining as a heuristic function, and I'll see what shakes out.
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
OmnipotentEntity wrote:
Do you allow strict opposite difference?
Yes, I called that out specifically as one of the 7 total distinguishable cases.
I didn't want to allow it, because why don't allow then 3 balls equal 2 balls, like 2 2 1 and 2 3? Probably it doesn't matter though. And you see here they are strict opposite.
OmnipotentEntity wrote:
I think you misunderstand what I'm considering the end state. Because there are 7 total distinguishable cases with 12 ways of making those 7 cases, I'm considering the end state to be when each ball is reduced to one viable hypothesis. Not just when I decide their relationship solely to the normal balls.
Okay, as far as I understand now, you mean answer is also telling their oddness, do they both lighter and equal between, etc.
OmnipotentEntity wrote:
Also I believe you may have entered the data in your calculator wrong? It's still 18Choose2, not 12Choose2.
Well, I use interactive python:
>>> log(12*18*17/2,2)
10.842350343413809
>>> log(8*12*11/2,2)
9.044394119358454
One above is your calculation, below is mine. So, everything is fine. If you mean that I should put choose from 18 - no, with my reasoning I should choose from 12. And, yes, with additional case with strict opposite, multiplier would be 12.
OmnipotentEntity wrote:
Next, I'm going to set up a system for generating a graph, and perform an A* search on it using the entropy remaining as a heuristic function, and I'll see what shakes out.
I think the real trouble will be number of intersection of hypothesis.
HHS
Active player (286)
Joined: 10/8/2006
Posts: 356
In a recent youtube video, the uploader presents a problem where you have 1000 drinks, 100 of which are poisoned, and a machine that can analyze a sample and tell if there's poison in it or not. The question is, what is the most efficient way (in terms of the number of measurements) to determine which drinks are poisoned? He then goes on to present "the solution", which consists of dividing the drinks into groups of 4 and measuring each group. Then, he reasons, because there is a 0,6561 chance of any of these groups of 4 being okay, we are left with 86 groups, in other words 344 drinks, to analyze one by one. We therefore need to make 594 measurements. He then proudly shows some graphs and mentions the word "optimal" a lot. What is the actual optimal solution, though? For the average case, I estimate that by measuring groups of 7, then groups of 3, then doing them one by one, you'll make roughly 560 measurements. If we want to minimize the worst case, the best I could come up with was doing groups of 3, and then doing them one by one, for 631 measurements in total. (Special case: if you end up with 100 positives in the first step, you'll at most need to make 200 additional measurements.) How close can we get to the theoretical minimum of 469?
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
HHS wrote:
If we want to minimize the worst case, the best I could come up with was doing groups of 3, and then doing them one by one, for 631 measurements in total. (Special case: if you end up with 100 positives in the first step, you'll at most need to make 200 additional measurements.)
For groups of 3 there is following thing: lets assume we have 100 positives. Then, we know that in each group should be exactly one poisoned drink. So, if we test one by one within group, then as soon as we find poisoned one move onto next group. So worst case: we try two of drinks and none of them is poisoned, and we know for sure that last one is poisoned. Therefore in this case 2 tests is enough for group. So after we have 100 positives, we can solve with 200 tests. Tricky part is how we find 100 positives. If we have already 99 positives, there is chance that we have 99 positives, or 100 positives, so we need to dicide it. In good case we have 99 positives and many groups left: then we can do binary search! But in worst case we have 99 positives when we have left exactly one group that may have poisoned or not. So we need to test all groups in worst case. Assuming that there are 332 groups of 3 and 1 group of 4, then we need 333 tests to find out number of positives. In case with 100 positives we can solve in 333+200 = 533 tests in total. And, it's not worst case. The worst case is when positives are 99. This means that one of group has two poisoned dinks, and we don't know which. For each group there is a chance we have two in it. If after two tests within group we have no positives, it means last one is positive - only two tests. But there is case a bit worse: if we have second test positive, we don't know do we have two or one positive. So strategy is to check third one. In worst scenario we have all groups no yes no answers to poison test, and in the last one we don't need to do last test because we know that last group has two positives. And last group has 4 drinks, so in total, in this case we can solve in 333 + 300 - 1 = 632 tests. Now, things go even more complicated. We know that we should do 2 test at least for each group, then why not do them in the first place, and dicide stuff later? For last group we will do three tests. After 201 tests we have three kinds of groups: 0,1,2 poisoned drink found. For groups with 0 we know poisioned drink. If we have group with 2 poisoned drink found - no additional tests is need. So, we have left with only groups with 1 poisoned drink found. And our goal to find where is last one of two. We can do it with binary search. So we can solve using this strategy this case in 333 + 201 + 9 = 543 steps, where 9 is smallest power of two greater than 333. This leads to question: is it worst case? What if we have 98 positives? this may lead to 2 groups of two! How to deal with it? You can do similarly with groups of 4. In short: things are complicated. I don't know optimal solution.
HHS wrote:
How close can we get to the theoretical minimum of 469?
I have no idea. Theoretical minimum is 465 https://www.wolframalpha.com/input/?i=log%282%2CC%281000%2C100%29%29
Player (36)
Joined: 9/11/2004
Posts: 2631
I do believe the forum is on an information theory kick recently. Or maybe it's just me. To simplify the problem a little bit. We can not constrain the total number of poisoned drinks to 100. Any drink has a 10% probability of being poisoned. Formally, this means it's a Bernoulli process with probability 10%, so the total number of poisoned drinks is 100 +/- 9.5 or so. We're going to attempt to maximize the entropy of each test. H(x) = - sum_i P(x_i) log_2 P(x_i) Where P is the probability that the test comes up true or false. The probability that none of the drinks are poisoned is (1-p)^n where n is the number of drinks tested. And the probability that at least one is poisoned is 1-that. So our entropy will be H(x) = - ((1-p)^n log (1-p)^n + (1 - (1-p)^n) log (1 - (1-p)^n)) for the simplest case. However, if a sample group comes back poisoned, we can use Bayesian reasoning to update the probability that each is poisoned. p(poisoned | group failed) = p(group failed | poisoned) p(poisoned) / p(group failed) = 1 * p / (1 - (1 - p)^n) Now we have individual drinks having different probabilities of being poisoned. But that's ok, we can update our probability that any particular group is poisoned by treating each drink individually: All clean = prod_i (1 - p_i) At least one poisoned = 1 - All clean So now we have all of the parts required, an algorithm to use this information might be: 1. Select a group of drinks to test, such that the group selected will maximize the entropy. 2. Test the drinks and update the priors of the drinks. 3. Go back to 1 and repeat until all drinks are firmly determined That being said, this sort of greedy optimization strategy might not actually be completely optimal, but it's likely to be close. Here's the entropy values for the first few possible tests: 1: 0.4689955935892811 2: 0.7014714598838974 3: 0.8428959163008505 4: 0.9285019333230815 5: 0.9762424580788389 6: 0.9971458039189479 7: 0.9986404833379545 8: 0.9860043741939146 9: 0.9631147077856703 10: 0.9328822493001994 We we would want to choose to test 7 drinks for our first test. If the test passes, we know all 7 drinks are clean. If it fails, then we update the probability that each of those 7 drinks is bad to be p / (1-(1-p)^7) = 0.191... Then we go back to step one. We choose the combination of drinks such that entropy is maximized. Let's say we're at the case where the drinks are bad. Now we can calculate the entropy of each possibility and find that now our entropy is maximized when testing 5 untested drinks and 1 from the drinks that have already been tested once. And it's 0.9985133228266336 As you can see, these entropies are all very near the maximum that can be found from a yes/no test (which is 1 bit). So it's likely that the theoretical minimum is very close to the true minimum. We can use this approach for the case where the number of poisoned drinks is constrained as well; however, it's more complicated because the probabilities are mutually dependent. So you'll have to update all of the probabilities every time a test occurs. For instance, if we determine that 7 drinks do not have poison, then the remaining 93 drinks each have a higher chance of being poisoned (.1/.93 = 1.075ish). Now entropy is maximized by testing 6 drinks instead (6: 0.9999181786156655, 7: 0.9930579426000998)
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (98)
Joined: 12/12/2013
Posts: 380
Location: Russia
Well, I wrote a program, actually few, to use my strategy above. And, here is what I got: First, I tried to make this program without taking nice moves, and as expected I got a bit above 632 rough estimate. But it was due to bug. Then I add binary search in case when only one poisoned is remaining, and got 541: and was like o_O. I made some debug output, and it turned out that 541 was from 99 positives, and for some reason it said in this case you need 7 steps. And, it actually true, because we have 99 positives, and we decide from 99, so power of two is 7 not 9 as I've mentioned above. But, obviously I had a bug. And bug was with range of possible positives, so not all positives was considered. After it was fixed I got 633 and was like - why it's dumb and can't find 632 steps described above. And, it turns out that description above has mistake. :D Correct calculation is following: 333 (groups) + 301(summary size of groups) - 1 (we don't check last one) = 633. So, even with binary search, it wasn't able to find better than 633 steps "dumb approach". So, I had to add advanced knowledge that we already used a bit. It's when there are all negative in positive group, then remaining one is poisined. It's very tricky to apply correctly, because in some cases there are more 'empty' groups, in some cases there are less of those. So, idea is to find out number of minimum emtpy groups using pigeonhole principle or something similar. For example, if there are 99 positives, then there is possible that all of groups not empty. And, number of positives is always less than number of poisoned, so this information not enough to derive minimum number of 'empty' groups. So, I had to also consider how many poisoned drinks we got after tests one by one. And, knowing this information, if we have found 88 drinks among 99 positives, it means that at least 11 groups is empty, so at least 11 additional drinks are found! After this addition, I got result 568! So, in worst case you can solve it in 568 steps using this approach if my calculations and my program is right :D Now, a bit about worst case. Surprisingly, worst case turns out to be 85 positives. My program says that you need 333 tests to test groups, then 85 largest groups one by one except one in each group = 4+3*(85-1) - 85 = 171. And, it says worst case is when we found 85 poisoned drinks during 171 tests, so none of them are empty. And we have to find 15 poisoned drinks among 85 remaining drinks under suspicion. And, in worst case it says we need 64 steps for that. I thought worst case would be 75 positives, but my program tells it needs 558 steps, other interesting thing is that for 75 and less positives, worst found drinks is range with 75 in it, so 75 is also worst. And with decreasing number of positives this range is increasing. I need to mention, that I was always considering as worst case is when positive groups are largest in size. I think following proof works: if we know how to find out poisoned drinks in K steps for largest sizes, then we can always add 'dummy' - non poisoned drinks and do all our operations with them even though we know they are not poisoned. Thus, we have groups of largest possible sizes, and we know how to solve them in K steps. So, always consider largest groups in summary size. Okay, it was results of using strategy mentioned in previous post with some kind of pigeonhole principle applied. Now, thoughts regarding optimality. I though about entropy approach, and depending on positive and negative groups any consequent group size is varies. But lets for a moment forget about this fact. Lets assume we know list of sizes. We know that all negative (no poisoned drinks in group) - whole group is decided. So, in the end we only need positive groups. Also, lets assume we don't use intersecting groups. And here is what I think: now we know only total number of poisoned drinks we have, but no information about number of poisoned drinks within groups. It means, in worst case any next group that pick drinks from several groups may have positive result and gives no information. It's not strict statement. It's just intuition, it's not proof and very likely is wrong. But I think so because we get more information (intuitively) if we split this set of picked drinks by groups that was already picked, we get more information. For example, if we try set 3,4,5,6 and we already tried set 1,2,3,4 and also 5,6,7,8, so better would be split our set into two: 3,4 and 5,6. So, if we try 3,4 and it's positive, we know that 3,4,5,6 positive, and also know that 3,4 is positive. but if 3,4 is negative, we know 3,4 is negative, and if we want to know about 5,6, we just also check 5,6. So, two checks to know about two sets. But if we use 3,4,5,6 we will need three checks in worst case. It's just intuition, not a proof. Also, we don't know how many poisoned drinks within the group. And best way to find out is to check all of them. So, if we belive in this, then we have to find out how many poisoned among all groups, and we need again check all of them one by one. This leads to question: how large is total size of all positive groups using entropy method? I don't know. it's hard task. Actually, better to know what maximum total size of all groups for each possible number of positive groups. Then, I can try my approach with it, and probably it could be better. But we know that this size is not greater than (positives*3)+1. Other thing: for n > 1 to find (n-1) poisoned drinks from n you need n-1 steps. Even though entropy is same as to find 1 among n. You can't do better. because the only one set for test can reveal answer: pick non poisoned one. I think similarly you can prove for n > 2 that you need n-1 steps to find (n-2) poisoned drinks among n drinks. Also, optimal way to find out how many poisoned drinks among n require n steps. Other idea is following, if I always pick largest positive groups among picked groups size, then imagine we don't care about probability, and just trying to minimize for each possible number of positive groups total size of largest of them. I didn't think in this way much, so I can't say anything about it yet. Ah, I didn't mention, that to find 2 poisoned drinks among n there is better strategy similar to binary search, but I was lazy to add it. Idea is to try to find two non-intersecting largest groups that are both positive. And then, do binary search within them. Also, it may be better for larger number of poison drinks as well. Also, little note: my program can't find better than 999 steps solutions for number of poisoned drinks from 333 and above. first less than 999 steps is 998 steps for 332 poisoned drinks :)
Patashu
He/Him
Joined: 10/2/2005
Posts: 4046
Here's a programming/dictionary challenge: Target is a word game where the creator scrambles a nine letter word and designates one letter as the target. The solver's goal is to find all words that include the center letter, are 4+ letters, aren't plurals ending in s and aren't proper nouns. Obviously there will always be a nine letter word, but some Targets have more or less other words to find. What puzzle has the MOST and what has the LEAST?
My Chiptune music, made in Famitracker: http://soundcloud.com/patashu My twitch. I stream mostly shmups & rhythm games http://twitch.tv/patashu My youtube, again shmups and rhythm games and misc stuff: http://youtube.com/user/patashu
Skilled player (1419)
Joined: 10/27/2004
Posts: 1978
Location: Making an escape
Are "read" and "read" separate words? :P
A hundred years from now, they will gaze upon my work and marvel at my skills but never know my name. And that will be good enough for me.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Here is a fun one I recently found out in a YouTube video, I haven't seen it published anywhere. Let A and B be vectors in R2. An ellipse can be defined as the set of vectors P such that: P = A*cos(x) + B*sin(x) + C (*), where C is the vector that denotes the center of the ellipse. Notice that I do not require that A and B be perpendicular, as is commonly done. In fact, no requirement is necessary. Every pair of vectors A and B will define an ellipse, even linear dependent ones, if you consider a line segment as a degenerate ellipse. The consequence of this is that an ellipse does not have a unique representation by formula (*), there's an infinite choice of vectors A and B that will define the same curve. Besides that, we also have several formula for the ellipse in vector form: Area: S = pi*|A x B|, where A x B denotes the vector product Hypotenuse: D² = |A|² + |B|² Inside test: |(P-C)xA|² + |(P-C)xB|² < |A x B|², for a point P Tangent test: |R x A|² + |R x B|² = |Rx(P-C)|², for a point P with direction vector R The amazing thing is, all the previous formulas work for any parametrization of the ellipse with vectors A and B, they don't need to be perpendicular! Prove it.
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
blackpenredpen tackled recently this problem from the 1990 AIME math competition. Looks fun. If: ax+by = 3 ax2+by2 = 7 ax3+by3 = 16 ax4+by4 = 42 then: ax5+by5 = ?
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Yup, very fun! The trick is to multiply by x+y (ax^2+by^2)(x+y) = ax^3 + by^3 + xy(ax+by) 7(x+y) = 16 + 3xy If you do the same to the cube sum: 16(x+y) = 42 + 7xy Solving the linear system: x+y=-14 xy=-38 Now do the same for the fourth power sum: 42(x+y) = ax^5 + by^5 +16xy Then: ax^5+by^5 = 16*38 - 42*14 = 4*(8*19-21*7)=4*(152-147)=20 I find it weird that it decreases, but it's possible depending on the sign of a,b,x,y
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
Matt Parker's math puzzle of the week. Link to video
Banned User
Joined: 3/10/2004
Posts: 7698
Location: Finland
I'm slightly disappointed at the lack of enthusiasm.
Skilled player (1419)
Joined: 10/27/2004
Posts: 1978
Location: Making an escape
That's the jeep problem, related to the harmonic series.
A hundred years from now, they will gaze upon my work and marvel at my skills but never know my name. And that will be good enough for me.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
With the quarantine, I started taking a look at some math olympiad problems after a long time. Surprisingly, I actually managed to solve some of the easiest ones! One I liked a lot is Problem 1 of IMO 2017: For each integer a0 > 1, define the sequence of positive integers a0 , a1 , a2, ... by: an+1 = sqrt(an), if sqrt(an) is an integer, an+1 = an + 3, otherwise Determine all values of a0 for which there is a number A such that an = A for infinitely many values of n. Of course, IMO is infamous for very challenging problems, but this one is very doable. Keep playing around with the sequence until you notice a pattern and try to prove it. The solution is long, but I think it follows quite naturally. EDIT: Fixed typo as per the post below.
Editor, Expert player (2083)
Joined: 6/15/2005
Posts: 3289
p4wn3r wrote:
Determine all values of a0 for which there is a number A such that a0 = A for infinitely many values of n.
I think you mean an = A for infinitely many values of n. Otherwise the answer would be all natural numbers. I've seen this problem before, and yes, we finally get to use mathematical induction! I'll comment a bit later on how to do this; I would normally be all over these problems but I've been way too busy with real life for the last 8 months or so.
Ferret Warlord wrote:
That's the jeep problem, related to the harmonic series
Referring to this: https://en.wikipedia.org/wiki/Jeep_problem There was actually a related problem we discussed before about camels and bananas. It's somewhere in this thread but I don't have time to go back through everything to find it.
Player (36)
Joined: 9/11/2004
Posts: 2631
The IMO problem. If a0 is either exactly 1 or congruent to 0 mod 3, then it will enter an infinite loop and satisfy the condition. Lemmas: 1. The sqrt of a square integer will be divisible by 3 if and only if the square integer itself is divisible by three. This is easy to see. If n2 = (3k)2 then the factor of 3 will still be present after the sqrt peals away the exponent. 2. If an is congruent to 2 mod 3, then the sequence will diverge. This is because all square integers are congruent to 0 or 1 mod 3. There are no square integers congruent to 2 mod 3. And the action of +3 preserves value mod 3. To see why there are no square integers 2 mod 3: 0^2 mod 3 = 0 mod 3, 1^2 mod 3 = 1 mod 3, 2^2 mod 3 = 4 mod 3 = 1 mod 3 Cases: If a0 = 1, then trivially we get the sequence {1, 1, 1,...} If a0 = 0 mod 3, then If a0 > 9, then let (3k)2 be the largest square integer strictly less than a0. Then the action of +3 will bring the value to (3(k+1))2, and the action of sqrt will leave 3(k+1) which is strictly less than (3k)2. This is true for any k greater than 1. When k = 1, then we get the repeating pattern 3, 6, 9, ... so all initial values 0 mod 3 will eventually enter this repeating pattern. If a0 = 2 mod 3, then by Lemma 2, it is divergent. If a0 = 1 mod 3, then the action of sqrt will take this sequence either to 1 mod 3 or to 2 mod 3. Subcase where (3k + 1)2 (with k > 0) is the next value reached by the action +3, then depending on the value of k, the next square integer will be either 1 or 2 mod 3. If this value is 2 mod 3, then the sequence diverges. Otherwise the next square integer will be (3l + 1)2 with l <k>= k, then (3k + 1) >= (3 (k-1) + 1)2. Solving for k, we get k = 1 as the only integer value where this is true. However k = 1 gives 4, and sqrt(4) is 2 mod 3, which is divergent. Therefore, all values that are 1 mod 3 are divergent due to eventually "finding" a value of k such that 3k + 1 is 2 mod 3.
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
I really liked the Mathologer video proving quadratic reciprocity: Link to video The argument is based on a proof by Zolotarev which recasts the problem on signs of permutations. Apparently, some guy found a nice geometrical interpretation of the permutations considered by Zolotarev. I think the video could be improved, though. Most of the statements he introduced about the sign of permutations look like magic, but are in fact very simple when you study them carefully. Also, he does not explain why the multiplicative group of the field of integers mod p should have a primitive root. Maybe it's just me, because I know most of the math concepts, but I also thought his final observation, where he uses a relabeling to prove that the sign of the permutation is the Legendre symbol very difficult to follow. The argument in the blog post, which uses permutation cycles, although more mathy, is more natural in my opinion.