I'm going to admit I haven't read the last few posts because it's late and I'm mostly bored, but I see that they're rather long and involved, using a lot of programming and invoking Mobius transformations among other things. Which is all fine, because there's lots of insight and additional value to be earned from diving deeper into a problem such as this.
But I for one found this week's Riddler Classic to be kind of trivial. I usually at least put pen to paper, but this week I just kind of ran through the math while I was in the shower. Roughly speaking, we work backward and wish to avoid situations where our two numbers drift away from each other, such that (as is eventually inevitable) m>n (or n>m... I forget how we're choosing to label them). Well, relaxing the integer condition, we can make this series infinite by setting n=phi*m, where phi is the golden ratio. Likewise, any starting pair whose ratio isn't exactly phi will eventually drift away and terminate. Thus, we expect the optimal choice to be the integer closest to n/phi, or possibly the second-closest.
So for the "true" puzzle, we have 81/phi = 50.06 and so we expect our next lowest number to be 50 (or possibly 51) and we can work backwards from there. For the extra credit, we get 179/phi = 110.63 and our first guess should therefore be 111 or perhaps 110. (I suspect it's "extra credit" because both solutions need to be checked and they are very close, but that doesn't require that much effort.)
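For anyone who wants to sanity-check this numerically, here's a rough sketch. I'm assuming the difference rule we've been discussing (from the pair (N, m), keep taking differences until a term is no longer smaller than the one before it), so adjust if that's not quite the puzzle's statement:

```python
# Score every possible second number m for a given N and compare the best
# one with N/phi. The sequence rule is the backward-difference convention
# used in this thread; this is a sketch, not the official solution.
def backward_length(N, m):
    a, b, length = N, m, 2
    while b < a:              # stop once a term is >= the one before it
        a, b = b, a - b
        length += 1
    return length

phi = (1 + 5 ** 0.5) / 2
for N in (81, 179):
    best_m = max(range(1, N), key=lambda m: backward_length(N, m))
    print(N, N / phi, best_m, backward_length(N, best_m))
```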
Of course, all this is sort of "intuitive" and would need to be fleshed out into a full proof, maybe by quantifying the "drift" away from the golden ratio as the series progresses backwards. However, I would be rather shocked to find out I am a) wrong or worse, b) egregiously wrong, such that my solution is bested by another series that does many terms better.
I suspect much of this is outlined somewhere in the previous posts. I'll try to dive back into them tomorrow.
Well, for what it's worth, the Riddler posted the official solution. I was a bit disappointed. He mentions that people ran code to find the optimal value, but their code just scans all possible encodings, without even using dynamic programming to optimize the recursion. He then claims without proof that you should pick numbers close to the golden ratio and look at the nearby ones if they are greater, again without even giving a bound on how many you should check. What a bummer, the proof is pretty interesting.
It looks like most of us found intuitively that the second last number (denoted m from now on) has to be close to N/phi. (Here I'm thinking about the sequence backwards from (N, m, ...), until you get to a term which is greater than or equal to the previous term.) Indeed, as p4wn3r showed, a maximal sequence is achieved either by m=floor(N/phi) or m=ceiling(N/phi). The length of a sequence backwards from (N, m, ...) can be found by tracing p4wn3r's fixed-point diagram, but tracing it backwards starting from x=N/m (going outwards) until the x value is less than or equal to 1. Note that it is not actually possible for floor(N/phi) and ceiling(N/phi) to both give maximal sequences for the same N; by alternation, one gives an even length and the other an odd length.
What I found very curious is that, from my calculations, most N have a unique maximal sequence, but for some N, there are actually two maximal sequences. I wasn't able to find any N that had more than two maximal sequences so I'm not sure if they exist (I checked up to N=20000 or so). No, this is not acknowledged anywhere in Riddler's solution, not even in the PDF on the page that gives a list of maximal sequences up to N=200.
There appear to be an infinite number of them; the first few with two maximal sequences, as far as I can remember, are N=6, 15, 32, 35, ... Since the Fibonacci fractions 1/1, 2/1, 3/2, 5/3, 8/5, ... mark the boundaries of the intervals that determine sequence length (values of N/m closer to phi give longer sequences), some values of N have two values m1 and m2 where N/m1 and N/m2 both fit in the same "gap". For example, for N=35:
35/23=1.5217...
35/22=1.5909...
Both conveniently fit in the gap between 3/2=1.5 (exclusive) and 8/5=1.6 (inclusive). This gives the two maximal backward sequences (35, 23, 12, 11, 1, 10) and (35, 22, 13, 9, 4, 5). You can check that 35/21=1.6666... is exactly 5/3, so it fits in the gap from 2/1=2 (exclusive) to 5/3=1.6666... (inclusive), a worse gap than 3/2 to 8/5.
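If you want to reproduce this, here's a small sketch that scores every m for a given N under the same rule and stopping condition, and lists all the m that tie for the maximum:

```python
# List every m achieving the maximal backward-sequence length for a given N.
def backward_length(N, m):
    a, b, length = N, m, 2
    while b < a:               # stop once a term is >= the one before it
        a, b = b, a - b
        length += 1
    return length

def maximal_ms(N):
    lengths = {m: backward_length(N, m) for m in range(1, N)}
    best = max(lengths.values())
    return best, [m for m, L in lengths.items() if L == best]

print(maximal_ms(35))          # two maximal m here, 22 and 23, as above
# the N with more than one maximal sequence (6, 15, 32, 35, ... per the post above)
print([N for N in range(2, 200) if len(maximal_ms(N)[1]) > 1])
```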
I didn't think about it too much, so perhaps there is a proof that an infinite number of N have two maximal sequences, and/or a proof that there is no value of N with more than two maximal sequences, but I probably won't come back to that anymore.
----
As a quick problem, I will present the newest Riddler Express (I will just copy and paste the description):
You have a case with 11 golden globes, weighing 1 kilogram, 2 kg and so on, up to 11 kg. And you know exactly which globe is which.
You have arranged to sell one of the globes to a queen. She has heard tales of these orbs, and knows for a fact that they have masses from 1 kg to 11 kg. However, she doesn’t know which is which and will not simply take your word for it. She will purchase a globe if you can demonstrate its weight.
The queen has a balancing scale with two plates, one on each side. It shows whether the total weight on either plate is equal or, if not, which side is heavier. The queen can clearly see which globes you place on which plate. However, if at any point the mass on either side exceeds 11 kg, the scale will break and the queen will refuse to buy from you.
The queen is in a hurry for her next appointment, so time is limited. What is the fewest number of weighings by which you can prove the weight of at least one globe? Which globe is it?
I'll ignore the problem because I want to make a very special post here. TL;DR it's about this image:
This image is an Apollonian gasket, and a very asymmetrical one. I actually used a 3-4-5 triangle with the hypotenuse lying on the x axis to generate it. Most importantly, it's generated in a popular fractal generation software called Apophysis. I put the .flame file for the render on GitHub.
The thing is: to my knowledge no one has done that before. The reason is that Apophysis renders fractals as iterated function systems (IFS), and you actually need to come up with an IFS that will converge to the Apollonian gasket. From what I've seen in the literature, the only cases where an IFS was known were for very symmetrical configurations of the gasket, and they actually converged only to a part of it. That's not a problem in most situations, because you can always pick a simple case and transform it to a crazier configuration, but I think that's cheating. I always thought that this limitation of the methods was odd, because my intuition told me that the general case shouldn't be more difficult to solve than the simple ones.
So, after working on this problem on and off for the last week, I can finally say... I DID IT!!!!! Yay, go me. I got an original result.
Anyway, the GitHub repo I linked before has the code I used, and I wrote a Medium article explaining it. I hope it's useful/fun for people here. If you have reached the monthly limit on Medium, reach out to me and I'll send it to you.
The Riddler Express problem in my previous post was designed to have a simple solution (that's exactly why it is Express and not Classic):
Answer: Only one weighing required, proves the weight of the 11kg globe. Reasoning: Put 1 2 3 4 on one side and 11 on the other. Four globes on one side weigh a minimum of 10kg. This confirms the lone globe on the other side to be 11kg, since it's the only globe that is possibly heavier than four globes.
In other news: I mentioned in my previous post that I couldn't find any N that had more than two maximal sequences up to N=20000. I managed to find a proof that there is indeed no such N. The general idea is that it is impossible to have three or more N/m values in a gap, while not having any value in a better gap.
This confirms that completely solving the problem of finding all maximal sequences for a given N only requires checking three values of m:
1) m=floor(N/phi)
2) m=ceiling(N/phi)
3) If (1) is maximal: m=floor(N/phi)-1; if (2) is maximal: m=ceiling(N/phi)+1
I'll try to find some time in the next few days to write out the proof.
Here's a problem I came up with today.
Suppose you have a non-abelian group G. For this kind of problem, it's much more difficult to work with a concrete group than to look at its general properties. Anyway, the properties we want are:
* For any a and b in G, it's easy to calculate their product ab, and also their inverses a^-1 and b^-1. It's also easy to generate a random element t in G.
* From the property above, it follows that, given an element a in G, you can generate a secret element s, and evaluate the conjugate sas^-1.
* We now assume that the conjugacy problem, that is, given two elements a and b, finding an element g such that a = gbg^-1, is computationally intractable.
Now, the challenge is the following. Suppose that we have an element a of G, which is public knowledge. Suppose someone, named Alice, generates an element s of G as a private key and publishes the conjugate b = sas^-1 as a public key. That all makes sense because recovering s from a and b (the conjugacy problem) is hard, from our assumptions. Now, Alice wants another person, named Bob, to confirm that she knows the secret. The easiest way to do this is, of course, to show s to Bob. Then, by verifying the identity b = sas^-1, Bob can confirm the secret. However, that's undesirable because Alice doesn't trust Bob, and doing this she would lose her secret. So, is there a way that Alice can show to Bob that she knows the secret without Bob being able to deduce what it is?
Here's the solution. It's based on the concept of a zero-knowledge proof. The idea is: by knowing the secret s, there are questions which only Alice can answer reliably. So, by answering a set of questions, we can confirm that Alice knows the secret without her revealing it.
In our case, suppose Alice generates a random element t of the group G. She then computes c = tsas^-1t^-1, and publishes c. Now, Alice can answer 2 questions:
(1) Which element g in G is such that gag^-1 = c?
(2) Which element g in G is such that gbg^-1 = c?
Now, only Alice knows the answer to questions (1) and (2). They are ts and t, respectively. If someone who's not Alice can compute the answer, then the conjugacy problem is not hard, violating our assumptions. Also, the answer to these questions is just something randomly generated, so Alice is not disclosing the secret.
However, we are not done yet, because there are some weaknesses in simple protocols. Suppose we design a protocol where Alice publishes the element c, and Bob asks (1). Now, someone can impersonate Alice, because they can generate a random element u' and publish c' = u'au'^-1, and answer (1) without knowing s. Similarly, someone can also impersonate Alice if Bob asks (2). They can generate an element u' and calculate c' = u'bu'^-1 and again answer (2) without knowing s.
Can we solve this by requiring Bob to ask (1) and (2) simultaneously? Well, that does work to verify Alice, but unfortunately leaks the secret s. That's because s = t^-1(ts), so Bob can retrieve the secret by multiplying the inverse of the answer to question (2) by the answer to question (1).
The trick is, then, to require Bob to ask either (1) or (2) at random. That way, if Alice does know the secret, she can answer either question, while someone impersonating her only has a 50% chance of getting it right. By repeating this procedure many times, we can make the probability that an impostor passes the test extremely small, and our protocol works.
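To make the flow concrete, here's a toy sketch in Python. I'm using permutations as the group purely so that it runs; obviously the symmetric group is a terrible choice cryptographically (conjugacy is easy there), so treat this as an illustration of the protocol's mechanics, not of its security:

```python
# Toy sketch of the protocol, using permutations (as tuples) for the group.
# In the symmetric group the conjugacy problem is easy, so this only shows
# the mechanics, not the security.
import random

SIZE = 16

def compose(p, q):                 # (p*q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def conjugate(g, x):               # g x g^-1
    return compose(compose(g, x), inverse(g))

def random_element():
    p = list(range(SIZE))
    random.shuffle(p)
    return tuple(p)

# Setup: public a, Alice's secret s, public key b = s a s^-1.
a = random_element()
s = random_element()
b = conjugate(s, a)

def one_round():
    t = random_element()               # Alice's fresh randomness
    c = conjugate(t, b)                # published; equals (ts) a (ts)^-1
    question = random.choice([1, 2])   # Bob's challenge
    answer = compose(t, s) if question == 1 else t
    target = a if question == 1 else b
    return conjugate(answer, target) == c

# Repeating k rounds leaves an impostor only about a 2^-k chance of passing.
print(all(one_round() for _ in range(20)))
```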
I finally got some spare time. As promised, I will explain how I found that it is impossible to have more than two distinct maximal sequences ending in the same N:
As p4wn3r explained before, the Fibonacci fractions 1/1, 2/1, 3/2, 5/3, 8/5, etc. represent the boundaries for the quantity N/m regarding the length of the sequence ending in (...,m,N). The fractions approach phi=1.618...
(I added numbers to show how long the maximal sequence will be for each interval.)
Alternatively, we can look at the reciprocal fractions 1/1, 1/2, 2/3, 3/5, 5/8, etc. which are the boundaries for the quantity m/N regarding sequence length. The fractions approach 1/phi=0.618...
In order for there to be three or more maximal sequences, there needs to be three separate values m1/N, m2/N, m3/N all in the same interval, and no m/N in any better interval. The situation is shown below, showing the red interval labelled "n" where we want three values, and no value must be in the blue interval labelled ">n". F(n) or Fn means the nth Fibonacci number. (Note: This diagram is for n even. If n is odd, the diagram is reversed so the red interval is below 1/phi, rather than above it. Similar argument.)
Noting that the possible values of m/N are equally spaced (consecutive ones differ by 1/N), it does not appear possible to put three values in the red interval and none in the blue interval. To show this, we just need to show that the red interval is less than twice as long as the blue interval. Or equivalently, the distance from F(n-2)/F(n-1) to F(n-3)/F(n-2) is less than three times the distance from F(n-2)/F(n-1) to F(n-1)/F(n):
|F(n-2)/F(n-1) - F(n-3)/F(n-2)| = |F(n-2)^2 - F(n-1)F(n-3)| / (F(n-1)F(n-2)) = 1/(F(n-1)F(n-2))
|F(n-2)/F(n-1) - F(n-1)/F(n)| = |F(n-2)F(n) - F(n-1)^2| / (F(n-1)F(n)) = 1/(F(n-1)F(n))
where the numerators have absolute value 1 because of Cassini's Identity.
Since F(n) is less than three times as large as F(n-2) for n>=5 (note that F(n)/F(n-2) tends to phi^2 = 2.618...), it follows that 1/(F(n-1)F(n-2)) is less than three times that of 1/(F(n-1)F(n)), proving that it is impossible to have three values of m/N in the red interval and none in the blue interval, when n>=5. The cases n=2,3,4 can be ruled out simply by checking exhaustively for N<=6, and observing that for N>=7, there is always a value of m/N in an interval n>=5.
----
Yes, I solved it two or three days ago knowing about zero-knowledge proofs and using the discrete log example on that page as a basis. I ended up with a solution that is practically the same as yours. Though I don't think anyone would be able to solve this unless they knew about zero-knowledge proofs in the first place.
P.S. The term "proof" is actually misleading; it's more like "zero-knowledge protocol" since there is always a nonzero possibility of falsely passing. Think about how some "primality tests" don't actually (in the mathematical sense) prove that a number is prime, and it's similar.
This week's Riddler Classic is a pretty standard number theory problem. Find a six digit Carmichael number which can be written as ABCABC in base 10. As the Riddler notes, try to solve it without writing brute force code; you learn a lot of number theory doing this.
In other news, I made a lot of progress understanding the math behind the Apollonian gasket since my last post! I improved my code a lot, automated a lot of things to let me try ideas faster, and managed to figure out the properties of the transformations that generate the attractor. One very nice result I got is that there's a version of the gasket which is unbounded; have a look:
See how the color gets weaker the farther out we go; that's an intrinsic limitation of the rendering software, but I think it's very cool!
I should write something more elaborate later, but here are the formal properties behind the attractor. You need two Möbius transformations S and T. We take S a diagonal loxodromic transformation, whose multiplier is an ugly complex number you get by solving a quadratic equation with complex coefficients. For T, we take an elliptic transformation that rotates by 120 degrees, so it satisfies T^3=I. It also must satisfy another property. An elliptic transformation has two fixed points, and we must choose these fixed points as complex numbers a and b such that their ratio a/b equals an even uglier constant, which is found by multiplying lots of square roots. Once you have these, you construct the two functions in the attractor (notice that my previous approach used four!):
F = ST
G = TS^-1
One interesting thing about these generators is that they satisfy (FG)^3 = (GF)^3 = I. You can try proving this if you are bored.
Anyway, if you pick a random Möbius transformation P and conjugate F and G by it, you get another gasket. While it's pretty elegant to generate the transformations like this, it has a drawback: it's pretty much impossible to know what the gasket will look like, and you have to input two very complicated constants that, while they do have closed forms, are much easier to evaluate numerically. Recently, I've been coding a hybrid of these approaches, where I start with a configuration where I know what the gasket will look like, start changing these transformations, and then modify these constants a little bit. Obviously I don't obtain the Apollonian gasket, but another fractal that still looks cool. The problem is: this is taking a lot of work, which is why I haven't written anything in detail yet, but I should be able to finish soon.
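For anyone curious what the rendering loop looks like, here's a rough Python sketch of the construction. I'm not reproducing the two ugly constants here, so the placeholder values below draw whatever two-generator limit set they happen to give, not the actual gasket; the point is just the structure of S, T, F and G and the iteration:

```python
# Rough sketch of the chaos-game iteration with two Moebius generators.
# The constants k (loxodromic multiplier) and a, b (elliptic fixed points)
# are placeholders, NOT the real values from the construction above.
import cmath
import random

k = 0.5 + 0.3j                       # placeholder multiplier, |k| != 1
a, b = 1.0 + 0j, -1.0 + 0j           # placeholder fixed points of T

def mul(A, B):                       # 2x2 complex matrix product
    return ((A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]),
            (A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]))

def inv(A):
    det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    return ((A[1][1]/det, -A[0][1]/det), (-A[1][0]/det, A[0][0]/det))

def apply(A, z):                     # Moebius action of A on a complex number
    return (A[0][0]*z + A[0][1]) / (A[1][0]*z + A[1][1])

S = ((k, 0), (0, 1))                             # diagonal loxodromic
M = ((1, -a), (1, -b))                           # sends a -> 0, b -> infinity
w = cmath.exp(2j * cmath.pi / 3)                 # primitive cube root of unity
T = mul(inv(M), mul(((w, 0), (0, 1)), M))        # elliptic of order 3: T^3 = I

F = mul(S, T)
G = mul(T, inv(S))

z, points = 0.1 + 0.1j, []
for i in range(100000):
    z = apply(random.choice((F, G)), z)
    if i > 100:                                  # skip the transient
        points.append(z)
# scatter-plot `points` (real vs imaginary parts) to see the attractor
```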
Let me know if you guys find this interesting!
I kind of brute forced it, but not via code. Let n = ABCABC = ABC * 1001 = ABC * 7 * 11 * 13. Since a Carmichael number must have p - 1 dividing n - 1 for every prime factor p (Korselt's criterion), 6, 10, and 12 must all divide n - 1. Take the prime factors of those numbers, and you get that 2, 3, and 5 must divide n - 1. Therefore, so must 2 * 3 * 5 = 30. So we have 30 x = n - 1 for some x, or 30 x + 1 = n.
Here's where the brute force comes in. Since n is odd, C must be one of 1, 3, 5, 7, 9. Let's assume that C = 1. Then we have 30 x + 1 = 7 * 11 * 13 * (10 * AB + 1). You can simplify that to 3 x = 7 * 11 * 13 * AB + 100. Taking everything modulo 3, you get 0 = 1 - (A + B). You can generate a list of the values of A and B that satisfy that relationship. It turns out that A = 1, B = 0, C = 1 yields a Carmichael number (I used this to confirm that 101101 is a Carmichael number) with the desired properties:
Its prime factors are 7, 11, 13, and 101, none of which is repeated (so it is squarefree), and p - 1 divides n - 1 for each of them.
As usual, I will verify all possibilities, as in: "find all possible six digit ABCABC..." as opposed to "find any one six digit ABCABC...". (Intuitively, ABC is most likely to work when it is a prime that is one more than a factor of 1000. Doing so gets you quickly to 101101 which is verified to work, and so is sufficient to answer the question as it is stated.)
ABCABC=7*11*13*ABC (ABC not necessarily prime)
So 6, 10, 12 must all divide ABCABC-1; this means C has to be 1, and AB1 must be 2 (mod 3) and 1 (mod 4). There are two cases:
Case 1: AB1 is prime. Then AB0 must divide AB1AB0 = 1000*AB1 + AB0, hence AB0 must divide 1000*AB1; since AB0 and AB1 are consecutive integers and therefore coprime, AB0 must be a 3-digit number that divides 1000. Only one possibility allows AB1 to satisfy all conditions so far: AB1=101. This gives the solution 101101.
Case 2: AB1 is not prime. From the conditions above, AB1 can't have 2, 3 or 5 as a factor. Additionally, AB1 can't have 7, 11 or 13 as a factor, since squared prime factors aren't allowed. So AB1 must have distinct factors from 17, 19, 23, 29, 31, 37, 41, 43, 47, 53 (larger factors would create numbers that are greater than 3 digits.) The only ways these factors can multiply to a 3-digit number AB1 are: 17*23=391 (not 1 mod 4), 17*43=731 (not 1 mod 4), 17*53=901 (not 2 mod 3), 19*29=551 (not 1 mod 4), and 23*37=851 (not 1 mod 4). None of them work, so there are no solutions when AB1 is not prime.
So the only solution is 101101.
Note: I have assumed that A is not 0, since the question clearly states finding six digit ABCABC. If A were allowed to be 0, then another solution would be 041041. (This is just an extension of Case 1 where we also check when 0B0 divides 1000.)
Very nice solutions. I did it by applying the Chinese remainder theorem. The nice thing about this theorem is that it gives the precise conditions under which we can lift a system of congruences to a single congruence in another modulus.
As mentioned before, a number of the form n = ABCABC can be written n = k*1001 = k*7*11*13. The conditions that 6, 10, and 12 divide n - 1 can be stated as:
n = 1 mod 6
n = 1 mod 10
n = 1 mod 12
This is not in the form where we can apply the Chinese remainder theorem, because we need all moduli to be coprime. However, we can work around this. See that n = 1 mod 12 implies n = 1 mod 6, so we can drop the first one. Also, n = 1 mod 10 can be replaced by n = 1 mod 5 (the factor of 2 is already covered by n = 1 mod 12, which forces n to be odd), and we have:
n = 1 mod 5
n = 1 mod 12
The Chinese remainder theorem tells us that this is equivalent to a congruence mod 60. You can use the formula there or just notice that n = 1 mod 60 implies both. Now, if we substitute n = k*1001 and notice that 1001 = 41 mod 60, we must find k with k*41 = 1 mod 60. That is, we must find the multiplicative inverse of 41 modulo 60. Again, you can use Euclid's algorithm to do this, but I simply tried all of 11, 21, 31, 41 and 51, since the multiplicative inverse must end in 1. Doing this, you'll see that 41*41 = 1681 = 28*60 + 1 = 1 mod 60. So, 41 is its own multiplicative inverse. So, we need to try all numbers k = 41 mod 60 in the range, which are 41, 101, 161, 221, 281, 341, 401, 461, 521, 581, 641, 701, 761, 821, 881, 941.
To check them, I did basically the same as FractalFusion; 41041 and 101101 are the only solutions.
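If you want to double-check the candidates by machine, here's a small sketch using Korselt's criterion (n is Carmichael iff it is composite, squarefree, and p - 1 divides n - 1 for every prime p dividing n); I'm assuming sympy just for the factoring:

```python
# Check the candidates k = 41 mod 60 (so n = 1001*k) with Korselt's criterion.
from sympy import factorint

def is_carmichael(n):
    factors = factorint(n)                      # {prime: exponent}
    if any(e > 1 for e in factors.values()):    # must be squarefree
        return False
    return all((n - 1) % (p - 1) == 0 for p in factors)

for k in range(41, 1000, 60):                   # the candidate values of k
    n = 1001 * k                                # always composite here
    if is_carmichael(n):
        print(n)                                # 41041 and 101101
```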
It is, of course, possible to prove that there are an infinite number of primes by observing that for any list of primes $\{p_i\}$, you can form the product and add one to generate a number that is either prime or has a prime factorization that only contains primes not on your list.
If we actually attempt to generate primes using this algorithm we get the following sequence: 2, 3, 7, 43, 1807, 3263443, ...
If we factorize these numbers we obtain the following sequence: 2, 3, 7, 13, 43, 73, 139, 181, 547, 607, 1033, 1171, ..., which are the only primes that will ever be generated by this algorithm. (Other primes get trapped into loops mod p; for instance, 5 gets trapped in the cycle 2, 3, 2, 3, ... mod 5.)
If we examine the ratio of the generated primes vs the total primes up to n, we seem to have a falling trajectory, and it would make sense if the ratio converged to 0 as n goes to infinity; these primes seem to be quite rare. This makes intuitive sense if we think about this particular algorithm mod p: it needs to generate a 0 mod p before generating any value it has already generated. If we loosely consider the action of the algorithm as a random draw with replacement, it's easy to see that the probability of drawing a 0 first is much less than that of drawing some repeated value, due to the Birthday Paradox. (That being said, this Euclid-style construction definitely is not a random process mod p, and I'm not even certain it is particularly well-approximated by one.)
(x-axis is natural numbers and y-axis is the ratio)
(x-axis counts the nth prime and y-axis is the ratio)
Can we prove this ratio tends to zero?
Also beginning with other seed values will generate different sequences. All odd primes will immediately generate 2, so 2 will appear in all such sequences. Are there other primes that appear more often than chance would dictate? Less often?
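Here's a rough sketch of the experiment for anyone who wants to try other seeds; it only goes a few terms deep, since the terms (and the cost of factoring them) grow doubly exponentially:

```python
# Generate the "product of everything so far, plus one" sequence from a seed,
# factor each term, and see what fraction of the primes below a cutoff have
# appeared as factors so far. Only a few terms are used; they grow very fast.
from sympy import factorint, primerange

seed = 2
terms = [seed]
found = set(factorint(seed))
prod = seed
for _ in range(6):                      # keeps the largest term around 10^13
    nxt = prod + 1
    found.update(factorint(nxt))
    terms.append(nxt)
    prod *= nxt

cutoff = 1000
primes = list(primerange(2, cutoff))
hits = sorted(p for p in primes if p in found)
print(terms)
print(sorted(found))
print(f"{len(hits)} of {len(primes)} primes below {cutoff} have appeared")
```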
Yesterday's smbc had an interesting mathematics challenge: https://www.smbc-comics.com/comic/derivative
A = The number of symbols in a function
B = The number of symbols in that function's derivative.
Homework: Find the function of finite length with the highest ratio of B to A.
Now, obviously you have to decide for yourself what 'symbols' means, but after doing that, what's the best you can come up with?
I asked a friend who came up with a very pleasing solution, but it might be possible to do even better: https://xeno.chat/@bj/108199208965188517
D(e^x)=e^x
D(e^(e^x))=e^(e^x) e^x, thanks to the chain rule
D(e^(e^(e^x)))=e^(e^(e^x)) e^(e^x) e^x, and so on
The derivative of an exponent-tower of e^e^e^...^x is a product of these towers starting from the original and going down to e^x. The number of symbols on the left side increases linearly, but the number of symbols on the right increases quadratically, so the higher you make your tower, the higher your ratio is, and you can make the ratio arbitrarily large.
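A quick way to see the growth concretely is to let a CAS do the counting. Here's a sympy sketch; count_ops is only one of many defensible ways to count "symbols", but it's enough to show linear growth on the left versus roughly quadratic growth on the right:

```python
# Count operations in e^(e^(...^x)) and in its derivative as the tower grows.
import sympy as sp

x = sp.symbols('x')
expr = x
for height in range(1, 9):
    expr = sp.exp(expr)                 # tower of the given height
    deriv = sp.diff(expr, x)
    print(height, sp.count_ops(expr), sp.count_ops(deriv))
```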
Obviously if you accept this answer, the question evolves past 'what is the highest ratio' into 'what's the fastest growing function you can write to produce higher ratios faster? Can you make bigger polynomials? Exponentials? Bigger?'
There are two rules that allow the combination of functions (in a way that increases the total terms): the chain rule and the product rule. (The quotient rule is just the product rule in funny clothing, and so doesn't increase the complexity more quickly than the product rule alone.) Both of these have the effect of a quadratic increase in the number of terms. Can we do better than quadratic by composing the two methods of combining functions?
If we consider the set of abstract functions f_I (where I is an arbitrary-dimensional multi-index) which are collectively differentiable over some common interval and which cannot be simplified through combination of themselves or their first derivatives, then we can consider some recursive definition of composite functions g_I, where the example given in the last post is structured similarly to:
g_0(x) = f_0(x)
g_n(x) = g_{n-1}(f_n(x))
And maybe a more complex example might be where at every level of function nesting we have a product of two functions:
g_0(x_0, x_1=x_0) = f_0(x_0) f_1(x_1)
g_1(x_00, x_01=x_00, x_10=x_00, x_11=x_00) = f_0(f_00(x_00) f_01(x_01)) f_1(f_10(x_10) f_11(x_11))
g_n({x_{I0}, x_{I1}=x_{I0}}) = g_{n-1}({f_{I0}(x_{I0}) f_{I1}(x_{I1})})
Where the notation {x_{I0}, x_{I1}} denotes a set of 2^{n+1} arguments with I being a valid n-dimensional multi-index running from 0 to 1 in each dimension. The indexing notation here is getting a bit dizzying but the upshot is these arbitrary recursive composite functions are themselves just plain functions. Because these functions are themselves differentiable over the same common interval, then we can just examine the recursive case. And because the recursive case is just a plain function the best we can do is quadratic as compared to the input.
I hope that's careful enough.
Assuming that we are using only elementary operations (addition, subtraction, multiplication, division, exponentiation, root) and elementary functions (e^, log(), ln(), abs(), and all trig and inverse trig functions):
The operation I could find that produces a large number of operations in its derivative is the f(x)-th root of x (that is, x^(1/f(x)), considered as a single operation joined to the function f(x), which has some number of operations itself).
By taking the derivative of x^(1/f(x)), we get x^(1/f(x)) * [f(x)/x - ln(x)*f'(x)] / (f(x))^2. That's three f(x)'s and an f'(x) in the formula, which gives rise to a lot of operations.
Let f_n(x) be the nth iteration of rooting: that is, f_1(x)=x and f_n(x)=x^(1/f_(n-1)(x)). In proper math notation, it will look like this (I had to use Paint because even LaTeX can't generate this properly):
Let c(n) be the number of operations in the derivative of f_n(x). Then c(1)=1, c(2)=7, and
c(n) = n+1+(n-1)+1+3+c(n-1)+n = 3n+4+c(n-1) (assuming I counted operations correctly)
or, in other words, the recursion c(n) - c(n-1) = 3n+4 for all n>=3. Solving this gives a quadratic 1.5n^2+5.5n+C, and with the value c(2)=7, we get c(n)=1.5n^2+5.5n-10 for all n>=2. (Note that c(1) doesn't follow this formula; the recursion only starts at n=3, because f_2's derivative can be simplified to reduce the number of operations, resulting in c(2)=7.)
So this gives quadratic growth with a coefficient of 1.5, the best I can find.
NOTE: The operations f(x)^x ("f(x) to the x") and f(x)^(1/x) ("f(x) root x") also generate three f(x)'s and an f'(x) in their derivatives, but unfortunately their iterations are easily simplified. This disqualifies both of them.
Edit: Added "the derivative of" in there. We're obviously taking the number of operations of the derivative of f_n(x), not f_n(x) itself.
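As a rough cross-check, here's a sympy sketch of the same iteration. Its counts won't match the hand count above (sympy normalizes the expression its own way, and count_ops is only a proxy), but the roughly quadratic growth should still show up:

```python
# Build f_n(x) = x^(1/f_(n-1)(x)) and count operations in its derivative.
import sympy as sp

x = sp.symbols('x', positive=True)
f = x
for n in range(2, 9):
    f = x ** (1 / f)                    # the next iterate of rooting
    d = sp.diff(f, x)
    print(n, sp.count_ops(f), sp.count_ops(d))
```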
I wish to "cheat" at Twitch predictions. By "cheat", I actually just mean to look at statistics on how users wager and determine when it would be ideal to wager.
For those not familiar with it, on some channels when the streamer so cares, they can "open a prediction" in which users wager channel points. For the sake of this discussion, we'll assume that the predictions are binary. All channel points are distributed to the winners of the prediction according to the number of channel points wagered.
For example, a prediction might be, "Will [streamer] beat the next level in under two minutes?" After five minutes of open wagering, let's suppose 300,000 channel points are wagered on "Yes" and 100,000 channel points are wagered on "No", giving 3:1 odds overall. If I were to wager 60 points on "Yes", I stand to lose all 60 points if I'm wrong but gross 80 points (net 20) if I'm right. If I were to wager 60 points on "No", I stand to lose 60 points if I'm wrong but gross 240 points (net 180) if I'm right.
If viewers a) had perfect information and b) were perfectly rational, the efficient-market hypothesis states that it would be impossible for me or anyone to gain channel points through wagering. But I suspect (and already have some evidence in my favor) that viewers are a bit optimistic, being too eager to bet points on the streamer's success just because they want to show their belief in them. The question is whether I can quantify this bias and understand its limitations to produce a betting strategy.
Now, I already have a method for doing this and it's produced some interesting results, albeit with a small sample size, but I'm curious what the canonical method of analysis is. This is surely a well-studied problem in statistics and it's a bit of a weak point in my education.
I am not familiar, but it sounds like you are describing sports betting. I'm sure there are at least a thousand articles out there that you can find.
That's all I can really say about it.
Here's a problem I've been thinking about:
----
You start with a positive integer N>1 and want to get down to 1 in a number of moves. Each move consists of going from the current number to a smaller positive integer chosen uniformly at random. That is, from k, you randomly go to one of 1, 2, ..., k-1, each with equal probability. Keep doing this until you reach the number 1.
For example, starting from 8, a sample three-move sequence is 8→5→2→1: Starting with 8, go from 8→5 (5 is chosen with probability 1/7), then from 5→2 (2 is chosen with probability 1/4), then from 2→1 (1 is chosen with probability 1).
Starting from N, what is the expected number of moves to reach the number 1?
Is this only coincidentally this week's Riddler Express? I wrote up a solution up to N=8 while I was bored over dinner, but forgot to take the slip of paper with me and haven't bothered to reproduce the results I found.
Anyway, regarding my previous post over Twitch wagering, I plotted a random walk of Twitch wagers. The horizontal axis sorts the wagers from most favorable odds to least favorable (although least to most would work just as well). For the vertical axis, take a step either up or down weighted according to the likelihood of success. For example, if viewers give a wager 7:3 odds in favor of success, then if the streamer is successful, take a step of 30 units upward, whereas if they fail, take a step of 70 units downward. By this construction, if wagers accurately reflect probability, the distribution should remain centered around y=0. Systematic deviation from this axis indicates an opportunity to earn channel points according to others' over-/underestimates. (Technically, of course, it is not the value of the function but its "derivative" that matters, but in the limit as the number of wagers plotted approaches infinity, we expect a non-differentiable function. We can still eyeball systematic trends as we would looking at stock prices, though. I plotted a cubic spline through it to help.)
The result: For the streamer whose wagers I've plotted, I've tracked about 35 so far and there appear to be inflection points roughly around 100% odds and 60% odds, a maximum near 80% odds, and a dearth of data in the 0% to 30% range. This means that for strong odds, viewers are actually overly doubtful and it is best to bet in the streamer's favor. For longer odds, it's then better to doubt (although the trend turns around again past 30%, but this is more noise than signal). Near 80 percent chance of success, it is best to not wager or, if I do wager, I can expect to break even.
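For concreteness, here's a minimal sketch of that walk construction, with made-up data standing in for the real wagers (the step sizes follow the 7:3 example above):

```python
# Build the walk described above from fake data. Each wager is
# (implied probability of success, outcome); a success steps up by (1-p)*100
# and a failure steps down by p*100, so a perfectly calibrated market keeps
# the walk centered around zero.
import random

random.seed(0)
wagers = []
for _ in range(35):
    p = random.uniform(0.3, 0.95)              # implied odds from the pool
    wagers.append((p, random.random() < p))    # fake outcome drawn at that rate

wagers.sort(key=lambda w: w[0], reverse=True)  # most favorable odds first

walk, y = [0.0], 0.0
for p, success in wagers:
    y += (1 - p) * 100 if success else -p * 100
    walk.append(y)
print(walk)   # plot against its index; systematic drift suggests a bias
```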
I initially did this by writing out possibilities and looking for patterns, so here's hoping that the following symbolic solution doesn't make any horrible mistakes:
Let M(x) be the number of sequences leading from x down to 1 via the method described in the problem, where x > 1. When the starting number is N, there are N-1 possibilities for the next number in the sequence. Call this next number k. If k = 1, then we're done. For each 1 < k < N, there are M(k) sequences leading from k down to 1 that we need to include in M(N). Thus M(N) = M(N - 1) + M(N - 2) + ... + M(2) + 1. That last 1 is for the N→1 case. However, M(N - 2) + ... + M(2) + 1 is just M(N - 1), so M(N) = 2 M(N - 1). Obviously, M(2) = 1, which leads to M(N) = 2^(N - 2). Since each move occurs with uniform randomness, the probability of any sequence occurring that starts with a given N is 1/M(N).
Now let m(x) be the total number of moves in all sequences starting with x. We can deal with the total number because each sequence occurs with the same probability. If a sequence starts with N→A and has n moves, then the subsequence starting with A must have n - 1 moves. There are M(A) such subsequences, so we add M(A) moves in order to adjust for the extra N→A move in each one. This leads to m(N) = 1 + ∑_{A=2}^{N-1} [m(A) + M(A)]. The extra 1 is for the N→1 case. If you write out the sum, you notice that m(N) = 2 m(N - 1) + M(N - 1). You can do this substitution repeatedly to get a more general m(N) = 2^x m(N - x) + x M(N - 1). m(2) is 1, so it would be nice to take x = N - 2 in order to not have to worry about that m(N - x). This yields m(N) = N * 2^(N - 3) after some work. Finally, the expected number of moves is m(N) / M(N) = N / 2.
I haven't fully digested the content of your post. But I have worked a problem very similar to this with respect to neutron slowing down in my college courses. So I would expect them to have the same answer, which is log_2(N).
And in fact, when simulating this with N = 1000000, it only takes a handful of steps rather than 500000.
Here's my attempt.
Let f(n) be a random function that takes a positive integer: we let X_n be a random variable with a discrete uniform distribution from 1 to n-1 inclusive, and f(n) = n - X_n. This represents a single iteration of our process. Because there is no valid random variable X_1, we must define f(1) = 1. By applying this function multiple times we get the various iterations: f^2(n) is the second iteration, f^3(n) is the third, and so on. Because each step of the process is guaranteed to reduce the value, lim_{m→∞} f^m(n) = 1 for any finite value of n.
We can find the expected value of the smallest m such that f^m(n) = 1 by noting that this process has no memory. So E(m | f^m(n) = 1 and f^(m-1)(n) != 1) is always the same given the same value of n. Let g(n) = E(m | f^m(n) = 1 and f^(m-1)(n) != 1) (for n > 2)
First let's find some base cases:
g(1) = 0, because 1 is already equal to 1.
g(2) = 1, because f(2) = 1 always
g(3) = 1/2 * g(1) + 1/2 * g(2) + 1 = 1.5.
This final line is because half of the time, f(3) = 1, which needs g(1) additional steps to reach 1. The other half f(3) = 2, which needs g(2) additional steps to reach 1, and the final + 1 represents the current step.
And the inductive case:
g(n) = 1 + sum from i=1 to n-1 1/(n-1) g(i)
This recursion is unfortunately not a simple kind of recursion. It is linear, sure, but it is non-homogeneous, has non-constant coefficients, and is of non-constant order. I don't have any idea how to tackle this recursion explicitly. As I mentioned in my previous post, I assume this value is strongly linked to the logarithm. I misremembered though: I reviewed my notes and it seems like it should be base e rather than base 2. So I assume the solution is ln(n) + O(1)
Empirical evidence seems to suggest that the O(1) term approaches the Euler-Mascheroni constant. And in fact it seems, again empirically, that the value of g(n) is simply the (n-1)th harmonic number.
Knowing this, there is almost certainly a really elegant way to restate this problem and arrive at this solution cleanly. But I still cannot see it.
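Here's a quick exact check of that guess, computing g(n) from the recursion with rational arithmetic and comparing it against harmonic numbers:

```python
# Compare g(n), computed exactly from the recursion, with H_(n-1).
from fractions import Fraction

N = 20
g = [Fraction(0), Fraction(0)]        # g[0] is unused padding; g(1) = 0
for n in range(2, N + 1):
    g.append(1 + sum(g[1:n]) / (n - 1))

H = [Fraction(0)]                     # H_0 = 0
for i in range(1, N):
    H.append(H[-1] + Fraction(1, i))

print(all(g[n] == H[n - 1] for n in range(1, N + 1)))   # True
```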
EDIT:
Well, let's try induction then.
If g(n-1)...g(1) are H_(n-2)...H_0 (where H_0 is taken to simply be 0), then g(n) = 1 + 1/(n-1) * sum from i=1 to n-2 of H_i = 1 + 1/(n-1) * sum from i=1 to n-1 of (n-1-i)/i
This last transformation was recasting the sum as "columnwise" instead of "rowwise"
To explain, if you have:
H_3 = 1/1 + 1/2 + 1/3
H_2 = 1/1 + 1/2
H_1 = 1/1
Then H_3 + H_2 + H_1 = 3/1 + 2/2 + 1/3
Then if we bring the 1/(n-1) into the sum again, and perform partial fractions we get:
g(n) = 1 + sum from i=1 to n-1 of [(-1)/(n-1) + 1/i]
We can see that the first term in the sum, after summation, is simply -1, which cancels with the +1. And we can see that the second term is simply the (n-1)th harmonic number. So our induction holds!
Because we have g(1) = H_0 (and, if that's not satisfying, g(2) = H_1 and g(3) = H_2 from the worked examples up there), we have our full proof.
It's not very satisfying though, because it took a guess and check method. If you have a way of arriving at this from first principles please post it!
My mistake was in thinking that every sequence for a given N has the same probability of occurring as every other sequence for that N. I'll have to do some more work on this.