Skilled player (1532)
Joined: 7/25/2007
Posts: 299
Location: UK
In general:
(p-1) + (p-1)p + (p-1)p^2 + (p-1)p^3 + ... = -1

To see this, just add 1 to the sequence.
1 + (p-1) + (p-1)p + (p-1)p^2 + (p-1)p^3 + ...
= 0 + (1+p-1) + (p-1)p + (p-1)p^2 + (p-1)p^3 + ...
= 0 + p + (p-1)p + (p-1)p^2 + (p-1)p^3 + ...
= 0 + 0 + (1+p-1)p + (p-1)p^2 + (p-1)p^3 + ...
= 0 + 0 + (p)p + (p-1)p^2 + (p-1)p^3 + ...
= 0 + 0 + p^2 + (p-1)p^2 + (p-1)p^3 + ...
= 0 + 0 + 0 + (1+p-1)p^2 + (p-1)p^3 + ...
= 0 + 0 + 0 + (p)p^2 + (p-1)p^3 + ...
= 0 + 0 + 0 + 0 + p^3 + (p-1)p^3 + ...
= ...
= 0 + 0 + 0 + 0 + ... + 0 = 0

So adding 1 to the sequence gives 0, hence what we started with must be equal to -1.

For the exercise:
1+2+4+8+... = S1 = -1
1-2+4-8+... = S2
S1-S2 = 0+4+0+16+0+64+... = 4(1+4+16+...) = 4S3
S1+S2 = 2+0+8+0+32+... = 2(1+4+16+...) = 2S3
2(S1+S2) = 4S3 = S1-S2
2S1+2S2 = S1-S2
S1+2S2 = -S2
S1 = -3S2 = -1
S2 = -1/-3 = 1/3
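For anyone who wants to poke at this numerically, here is a minimal Python sketch (the function names are mine, not from any library). It measures how far the partial sums are from the claimed limits in the p-adic sense: the more times p divides the difference, the closer the two numbers are.

```python
def partial_sum(p, terms):
    """Ordinary integer sum of (p-1)*p^k for k = 0 .. terms-1."""
    return sum((p - 1) * p**k for k in range(terms))


def alternating_partial_sum(terms):
    """Ordinary integer sum of (-2)^k for k = 0 .. terms-1 (the series S2)."""
    return sum((-2) ** k for k in range(terms))


def p_adic_valuation(n, p):
    """Largest v such that p^v divides the non-zero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


if __name__ == "__main__":
    for p in (2, 3, 5):
        for terms in (1, 5, 10):
            # Difference between the partial sum and the claimed limit -1.
            gap = partial_sum(p, terms) + 1
            print("p =", p, "terms =", terms, "valuation of gap:", p_adic_valuation(gap, p))

    for terms in (1, 5, 10):
        # S2 = 1/3 means 3*S2 - 1 = 0, so look at 3*(partial sum) - 1 instead.
        gap = 3 * alternating_partial_sum(terms) - 1
        print("S2, terms =", terms, "valuation of gap:", p_adic_valuation(gap, 2))
```

The valuation of the gap grows with the number of terms, so the p-adic norm of the gap, p^(-valuation), shrinks to 0. Multiplying by 3 in the S2 check is just a way to stay inside the integers.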
Editor, Skilled player (1938)
Joined: 6/15/2005
Posts: 3243
p4wn3r wrote:
p4wn3r wrote:
So, for this exercise, replace the usual absolute value of the real numbers with p-adic norm for p=2, also known as the 2-adic norm or the dyadic norm. Prove that, with respect to this metric: 1 + 2 + 4 + 8 + ... = -1
Well, it seems no one solved it. It's not very hard, though.
I solved it. I just never bothered to tell you. It is conceptually easy once you understand p-adics.
p4wn3r wrote:
FractalFusion wrote:
You are studying a new strain of bacteria, Riddlerium classicum (or R. classicum, as the researchers call it). Each R. classicum bacterium will do one of two things: split into two copies of itself or die. There is an 80 percent chance of the former and a 20 percent chance of the latter. If you start with a single R. classicum bacterium, what is the probability that it will lead to an everlasting colony (i.e., the colony will theoretically persist for an infinite amount of time)? Extra credit: Suppose that, instead of 80 percent, each bacterium divides with probability p. Now what’s the probability that a single bacterium will lead to an everlasting colony?
Suppose that the probability of an everlasting colony is x. After a time lapse, the bacterium will either split or die. If it splits, we have two bacteria. The probability of both of them not generating an everlasting colony is (1-x)^2, so we'll have an eternal colony with probability 1-(1-x)^2 = 2x-x^2. Therefore, we have x = p(2x-x^2), so either x=0, or 1 = p(2-x) => x = 2 - 1/p. Since x is a number between 0 and 1, the second solution only makes sense for p >= 0.5. In particular, for p<0.5 the only valid solution is x=0. [...] >0 for all N and still satisfy that limit. Now, for p>0.5, I will argue that the correct solution is x = 2 - 1/p. If it were x=0, we would have that the expected number of bacteria after N events is 0, but it's in fact (2p)^N, which does not tend to 0 if p>0.5. So the probability should be positive, and therefore x = 2 - 1/p.
Yes, the probability is 0 if p<=0.5 and 2-1/p if p>0.5. I mentioned that there are paradoxes when p=0.5. This is because, as you said, the expected number of bacteria after N events, starting from 1 bacterium, is (2p)^N. When p=0.5, the expected number of bacteria is 1, which may indicate that the colony may survive with positive probability. Yet, as we have figured out, the colony is guaranteed to die out eventually. Even better, you might expect that a colony dies out if and only if the expected number of splits before dying out is finite. This is false. The expected number of splits before dying out when p<0.5 is p/(1-2p), which blows up as p approaches 0.5: for p=0.5, the expected number of splits before dying out is infinite, even though the colony is guaranteed to die out! You can think of this as a kind of random walk: a person standing at position n and having a 0.5 chance each of going to n+1 or n-1 at each step will eventually reach 0, yet the expected number of times the person will increase their position number before reaching 0 is infinite.
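As a rough numerical cross-check of the formula 2-1/p, here is a small Python sketch (the function name and the iteration count are my own choices). It uses the standard observation that the probability q_n of the colony being extinct within n generations satisfies q_0 = 0 and q_(n+1) = (1-p) + p*q_n^2, so iterating this recursion approaches the true extinction probability.

```python
def survival_probability(p, iterations=10_000):
    """Approximate the chance of an everlasting colony for split probability p.

    q is the probability that the colony is extinct within n generations;
    one generation later the founder has either died (1 - p) or split and
    both sub-colonies have died out (p * q * q).
    """
    q = 0.0
    for _ in range(iterations):
        q = (1 - p) + p * q * q
    return 1 - q


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.6, 0.8):
        print("p =", p, "survival ~", survival_probability(p))
```

For p=0.8 this settles at 0.75 = 2 - 1/0.8. At exactly p=0.5 the iteration creeps towards 0 very slowly, which fits the borderline behaviour discussed above.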
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Flip wrote:
In general: (p-1) + (p-1)p + (p-1)p^2 + (p-1)p^3 + ... = -1 To see this, just add 1 to the sequence. [...] So adding 1 to the sequence gives 0, hence what we started with must be equal to -1.
I love arguments like this! It reminds me of an anecdote I heard some time ago about someone who, after seeing the teacher prove 0.999... = 1, asked whether ...99999 = -1, essentially because of this. Of course, his ideas were too deep to be appreciated in that situation.
FractalFusion wrote:
I solved it. I just never bothered to tell you. It is conceptually easy once you understand p-adics.
Cool! I decided to post this problem because I thought the Mathologer answer to the -1/12 video was too shallow. The numberphile calculation is indeed not rigorous, it would require a p-adic completion at "p=1", which we still can't make sense of, but it's far from "nonsense". It's related to the mysterious field with one element, that people have trouble defining, but has the potential to solve lots of problems. It looks like Mathologer has no idea that non-archimedean completions exist, and I don't think it's too much to expect someone to understand a little of the subject they are talking about before dismissing it as nonsense.
FractalFusion wrote:
I mentioned that there are paradoxes when p=0.5. This is because, as you said, the expected number of bacteria after N events, starting from 1 bacterium, is (2p)^N. When p=0.5, the expected number of bacteria is 1, which may indicate that the colony may survive with positive probability. Yet, as we have figured out, the colony is guaranteed to die out eventually.
So, that was the paradox you were mentioning. I understood the p=1/2 case as an instance of gambler's ruin (second bullet point in Wikipedia).
------------------------
Ladies and gentlemen, I present to you my newest integral: Prove it! I am very proud of this one because I finally managed to come up with something whose exact value Wolfram Alpha could not find in its standard computation time. This is getting harder and harder to do as the software evolves.
Editor, Skilled player (1938)
Joined: 6/15/2005
Posts: 3243
p4wn3r wrote:
The numberphile calculation is indeed not rigorous, it would require a p-adic completion at "p=1", which we still can't make sense of, but it's far from "nonsense". It's related to the mysterious field with one element, that people have trouble defining, but has the potential to solve lots of problems.
The way you are describing p-adic completion at "p=1" (???) and the "mysterious field with one element" only makes me think there are better ways to describe ζ(-1). By the way, it is pretty clear that a lot of people see the Numberphile calculation as nonsense. That is entirely on Numberphile. There are ways to present ζ(-1) and other similar series without it coming across as nonsense.
p4wn3r wrote:
It looks like Mathologer has no idea that non-archimedean completions exist, and I don't think it's too much to expect someone to understand a little of the subject they are talking about before dismissing it as nonsense.
Is it not possible to prove that ζ(-1)=-1/12 using analysis in complex numbers only? If it is possible, then I wouldn't expect anyone to bring in p-adics when it is not necessary to do so.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
FractalFusion wrote:
The way you are describing p-adic completion at "p=1" (???) and the "mysterious field with one element" only makes me think there are better ways to describe ζ(-1). By the way, it is pretty clear that a lot of people see the Numberphile calculation as nonsense. That is entirely on Numberphile. There are ways to present ζ(-1) and other similar series without it coming across as nonsense.
I think this all boils down to what "better" is. If the goal is to grade people's answers as right or wrong by applying what's in a book, then I guess it's pretty annoying to have the high schoolers who watch Numberphile come up with 1 + 2 + 3 + 4 + ... = -1/12. However, if by "better" you understand something that can potentially solve open foundational problems, then it should not be a problem to communicate speculative ideas. I liked 3B1B's video a lot, especially the part where he says "you are a mathematician, not a robot".

In this context, I think the questions raised by Frenkel in the Numberphile video are spot-on. In very different contexts, the sum of natural numbers, or their cubes, and so on, pops up. And in quantum field theory, algebraic number theory, string theory, etc., assigning -1/12 to the series makes everything work. Quantum field theory, although not rigorously established, predicts outcomes of physical experiments with an accuracy of one part in a billion; it's crazy to doubt its validity.

By the way, I don't claim that my "p=1" picture is adequate. If I did, I would not post it here, I would submit it as a solution to the Riemann hypothesis. All that I am saying is that it's perfectly reasonable to define something where 1+2+3+4+...=-1/12 independent of the zeta function, since this would explain lots of things.

If Mathologer thinks that, in the specific case of zeta(s), this is not possible to do, it's a completely reasonable position. All I can say is that if previous mathematicians had assumed that, we would not have foundational results proven, like the finiteness of the ideal class group for a number field, or the Weil conjectures, because their proofs go in the opposite direction. In any case, if he thinks that, he's welcome to do what everyone else does: publish a paper about it.
FractalFusion wrote:
Is it not possible to prove that ζ(-1)=-1/12 using analysis in complex numbers only? If it is possible, then I wouldn't expect anyone to bring in p-adics when it is not necessary to do so.
You know, just as a sanity check, I decided to watch the Mathologer video again and see if he does prove this value of zeta using complex analysis. It turns out that he doesn't. In the part about analytic continuation, he suggests that an analytic function defined in an open subset of the complex plane defines it completely. While this is true, it is no guarantee that you can extend it to the entire complex plane. It might happen that the analytic continuation runs into an infinite number of singularities, like the prime zeta function, or modular forms in general, which cannot be continued to the lower half-plane. If he did prove it using complex analysis, why doesn't he mention stuff like the functional equation or the Euler transform of the zeta function, which justify its extension to the whole plane? I think he knows this, as he mentions people will nitpick him to death in the comments. This is plain old hypocrisy to me: requiring others to display an enormous amount of rigor about something that's mostly speculative, while not even bothering to correctly apply the results that are already established.
Player (36)
Joined: 9/11/2004
Posts: 2623
Fun integral. Here's how I approached it. But there's probably an easier way. First, let's get rid of that cosh. Next, do a u-sub so it's a bit nicer to work with. Now we move to complex analysis. Our contour is a semi-circular arc from -R to R in the first and second quadrants (for R > 1). And we're seeking the limit as R -> infinity (Errata: in the first line the bounds should be to R not to infinity.) Finally, compute the residue, and solve for the original integral.
Build a man a fire, warm him for a day, Set a man on fire, warm him for the rest of his life.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Congratulations! I posted this to some math groups I participate in and it seems to have stumped everyone. You are the first to provide an answer. I have two other solutions besides the reduction to the integral of x^(2/5)/(1+x^2), but I'm too busy at the moment to write them down here. Here are the hints for each one if you want to reproduce them.
HINT 1: Look at the original integrand. Is it even? Is it periodic in the imaginary axis? Can you solve it with a rectangular contour?
HINT 2: Expand cosh(2x/5) as a Taylor series. Look up the Euler numbers.
P.JBoy
Any
Editor
Joined: 3/25/2006
Posts: 850
Location: stuck in Pandora's box HELLPP!!!
If you're looking for another one that W|A can't solve: for |α| > 1
Player (36)
Joined: 9/11/2004
Posts: 2623
Is there a particular name for that construct? 1 - 2a cos(x) + a^2? I've been seeing it a lot lately.
P.JBoy
Any
Editor
Joined: 3/25/2006
Posts: 850
Location: stuck in Pandora's box HELLPP!!!
I'm not sure if it has a name, but it's the cosine law: with b = 1
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
P.JBoy wrote:
If you're looking for another one that W|A can't solve: for |α| > 1
Call the integral I(alpha). First, perform the change of variables x -> pi - x. It follows that I(alpha)=I(-alpha). From this, it is true that I(alpha) = 1/2*(I(alpha)+I(-alpha)). Explicitly: The last expression can be rewritten in terms of the original integral: (1) Another relation is: (2) From (2), it follows that I(0)=I(1)=0. Assume 0<alpha<1, then if we iterate (1): At the limit n -> infinity, the RHS tends to 0. So, I(alpha) = 0 for 0<alpha<1. Now, if alpha>1, we simply apply (2) and conclude I(alpha) = pi log(alpha^2). Since I(alpha)=I(-alpha), that also applies to alpha<-1, and we have found the answer.
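A quick numerical sanity check of the final values, as a sketch only. It assumes the integral in question is the classical one, the integral from 0 to pi of log(1 - 2 alpha cos(x) + alpha^2) dx, which matches the "1 - 2a cos(x) + a^2" construct mentioned above but is my assumption about the formula itself. The function name and the number of sample points are also my own choices.

```python
import math


def log_integral(alpha, n=2000):
    """Midpoint-rule estimate of the assumed integral
    int_0^pi log(1 - 2*alpha*cos(x) + alpha^2) dx, for |alpha| != 1."""
    h = math.pi / n
    return h * math.fsum(
        math.log(1 - 2 * alpha * math.cos((k + 0.5) * h) + alpha * alpha)
        for k in range(n)
    )


if __name__ == "__main__":
    for alpha in (0.5, 2.0, -3.0):
        claimed = math.pi * math.log(alpha**2) if abs(alpha) > 1 else 0.0
        print("alpha =", alpha, "numeric:", log_integral(alpha), "claimed:", claimed)
```

The midpoint rule is used because the integrand is smooth and periodic away from |alpha| = 1, where this simple rule converges very quickly. It is not meant to handle alpha = 1 or alpha = -1, where the integrand has a logarithmic singularity.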
Player (36)
Joined: 9/11/2004
Posts: 2623
A math.se user found a particularly elegant solution to p4wn3r's integral involving the Beta function.
Banned User, Former player
Joined: 3/10/2004
Posts: 7698
Location: Finland
p4wn3r wrote:
Cool! I decided to post this problem because I thought the Mathologer answer to the -1/12 video was too shallow.
The sum of all natural numbers is not -1/12, because it's actually -1/8. (I'm quite certain that you can make it equal to whatever number you want.)
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Warp wrote:
p4wn3r wrote:
Cool! I decided to post this problem because I thought the Mathologer answer to the -1/12 video was too shallow.
The sum of all natural numbers is not -1/12, because it's actually -1/8. (I'm quite certain that you can make it equal to whatever number you want.)
I'm also quite certain that you will ignore what I am writing, but I'll do so anyway. In most theories where people try to find axioms under which the sum can be defined as convergent, the structure the terms are summed over is a monoid that induces a gradation in the summed elements, so this gobbling up of elements that he does is not allowed.
Player (96)
Joined: 12/12/2013
Posts: 376
Location: Russia
This kind of proof works only if the limit exists. Such proofs rely on properties of limits, such as the limit of a sum of sequences and the reordering of a summation. And here is the interesting thing: in any other system where the limit exists but does not equal -1/8, for example, some of those basic properties must be violated.
Banned User, Former player
Joined: 3/10/2004
Posts: 7698
Location: Finland
Wouldn't it be a contradictory concept to state that a non-convergent sum converges to a finite value? It's either convergent or non-convergent. It cannot possibly be both at the same time.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Warp wrote:
Wouldn't it be a contradictory concept to state that a non-convergent sum converges to a finite value? It's either convergent or non-convergent. It cannot possibly be both at the same time.
Wow, really? Are you reading ANY of my posts now? In any case, no, it's not. As was discussed a few posts ago, to discuss convergence you need the concept of a metric, which tells you which numbers are close together. In the question I asked, I gave the example 1 + 2 + 4 + 8 + ..., which diverges in the usual metric, but converges to -1 in the 2-adic metric. Watch the 3blue1brown video if that helps (which I also linked before, by the way).

The discussion that we had, which was totally reasonable, was whether it's possible to arrive at the result 1+2+3+...=-1/12 without complex analysis, which I think is totally possible. There is this derivation, common in physics, where one introduces an exponential regulator. It's possible to define a metric where functions that differ by a pole of the form 1/epsilon^2 are very close together, so that the summation does, in fact, converge to -1/12.

The introduction of the exponential term also explains why blackpenredpen's way of summing terms gives a different value. His method relies on the fact that the average of an odd number of consecutive integers is the integer in the middle. Once the terms are modified with the regulator, that property is no longer true, and this method, in particular, fails. The right way to do this without invoking the regulator is to use a graded ring, which is a more advanced concept.
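To make the regulator idea a bit more concrete, here is a small Python sketch (the function name and the cutoff are my own choices, not part of any standard derivation). It sums n*exp(-epsilon*n) numerically, subtracts the 1/epsilon^2 pole, and prints what is left over as epsilon shrinks.

```python
import math


def regulated_sum(eps, tail=60.0):
    """Sum of n * exp(-eps * n) over n >= 1, truncated once exp(-eps * n)
    has dropped below roughly exp(-tail), where the remainder is negligible."""
    n_max = int(tail / eps)
    return math.fsum(n * math.exp(-eps * n) for n in range(1, n_max + 1))


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        finite_part = regulated_sum(eps) - 1 / eps**2
        print("eps =", eps, "sum minus pole:", finite_part)
    print("-1/12 =", -1 / 12)
```

The regulated sum has the closed form exp(-eps)/(1-exp(-eps))^2 = 1/eps^2 - 1/12 + eps^2/240 - ..., so once the pole is removed the remainder approaches -1/12 as eps goes to 0.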
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
More abstract nonsense. Now we're gonna start having fun with arrow chasing!
p4wn3r wrote:
A category consists of a collection of objects, and for each pair of objects, a set of morphisms between them. The collection of objects of a category C is often denoted obj(C), but we will usually denote the collection also by C. If A,B are in C, then the set of morphisms from A to B is denoted Mor(A,B). Morphisms compose as expected. If f is in Mor(A,B) and g is in Mor(B,C), there is a morphism g o f in Mor(A,C). Composition is associative. For each object A in C, there is the identity morphism id_A: A -> A, such that every morphism that's composed with it, either left or right, returns the same morphism. A morphism f is called an isomorphism if there is a (necessarily unique) morphism g such that both f o g and g o f are the identity morphism.
We need more definitions for this exercise. A (covariant) functor is a map between categories that preserves morphisms, that is, a functor F: C -> D between categories C and D associates to each object X in obj(C) an object F(X) in obj(D), and for each morphism f in Mor(X,Y) a morphism F(f) in Mor(F(X),F(Y)) such that the following holds:
(1) F(id_X) = id_F(X)
(2) F(g o f) = F(g) o F(f)
Functors, of course, compose with each other. If we have F: C -> D and G: D -> E, the functor G o F: C -> E exists. There's also the identity functor I: C -> C.
A natural transformation is a map between two functors F,G: C -> D that "preserves internal structure". Formally, a natural transformation n: F -> G associates to every object X in obj(C) a morphism n_X: F(X) -> G(X) between objects in D, such that:
(*) For every morphism f: X -> Y in C, we have n_Y o F(f) = G(f) o n_X
A better way to visualize this condition is using a commutative diagram, where every composition of morphisms along a path connecting a pair of vertices is understood to be equal:
Now, for the problems. It might help to represent diagrams for natural transformations in the style of the Wikipedia article:
(1) If we have two natural transformations n: F -> G and m: G -> H, where F,G,H: C -> D (that is, all of them are functors from category C to category D), show that we can compose them, and that composition has the structure of a monoid (it's associative and has an identity). This is the so-called vertical composition of natural transformations.
(2) If we have a natural transformation a: F -> G, where F,G: C -> D, and a functor H: D -> E, show that we can find a natural transformation from H o F to H o G, called the right whiskering of a with respect to H.
(3) Similarly, show that if we have a natural transformation b: G -> H, where G,H: D -> E, and a functor F: C -> D, we can find a natural transformation from G o F to H o F, called the left whiskering of b with respect to F.
(4) Finally, if we have two natural transformations a: F -> G and b: H -> K, where F,G: C -> D and H,K: D -> E, show that we can define the horizontal composition b o a: H o F -> K o G in two ways:
(a) Left whiskering followed by right whiskering
(b) Right whiskering followed by left whiskering
Also, show that the definitions (a) and (b) define the same natural transformation.
Bonus: why are they called vertical and horizontal composition?
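If the naturality condition (*) feels too abstract, here is a concrete instance sketched in Python. All names are hypothetical and chosen for this illustration: the "list" functor sends a function f to "apply f to every element", the "maybe" functor is modelled by tuples of length 0 or 1, and safe_head plays the role of a natural transformation n from list to maybe.

```python
def fmap_list(f, xs):
    """Action of the list functor on a function f."""
    return [f(x) for x in xs]


def fmap_maybe(f, m):
    """Action of the maybe functor; () means 'nothing', (x,) means 'just x'."""
    return tuple(f(x) for x in m)


def safe_head(xs):
    """Component of the transformation list -> maybe: first element, if any."""
    return (xs[0],) if xs else ()


def naturality_holds(f, xs):
    """Check n_Y o F(f) == G(f) o n_X on one input, with F = list, G = maybe."""
    return safe_head(fmap_list(f, xs)) == fmap_maybe(f, safe_head(xs))


if __name__ == "__main__":
    print(naturality_holds(lambda x: x * x, [3, 4, 5]))
    print(naturality_holds(str, []))
```

Checking a few inputs is of course not a proof, but it shows what the square is saying: mapping f and then taking the head gives the same result as taking the head and then mapping f.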
Banned User, Former player
Joined: 3/10/2004
Posts: 7698
Location: Finland
p4wn3r wrote:
The discussion that we had, which was totally reasonable, was whether it's possible to arrive at the result 1+2+3+...=-1/12 without complex analysis, which I think is totally possible. There is this derivation, common in physics, where one introduces an exponential regulator. It's possible to define a metric where functions that differ by a pole of the form 1/epsilon^2 are very close together, so that the summation does, in fact, converge to -1/12.
If the sum of the natural numbers is -1/12 according to these summation methods, why is the sum of the reciprocals of the natural numbers infinity? (If you could keep the answer as close to arithmetic as possible, I would be grateful.)
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Warp wrote:
If the sum of the natural numbers is -1/12 according to these summation methods, why is the sum of the reciprocals of the natural numbers infinity? (If you could keep the answer as close to arithmetic as possible, I would be grateful.)
Well, the method that sums the natural numbers to -1/12, Ramanujan summation, actually sums the reciprocals of the natural numbers to the Euler-Mascheroni constant.

Just so we're in the clear, I will not attempt to explain why only in terms of arithmetic for the following reason: there seems to be a widespread misconception that everything mathematicians do should be neatly expressible using numbers, addition, and multiplication. This is completely false. It happens a lot that the methods used to derive results are more interesting than the results themselves, so the more advanced results go completely beyond numbers, and it's actually very hard to see why they're relevant if you express them in elementary terms; they just look like a garbled mess of propositions.

Incidentally, that's how I make exercises intended to be hard, like the integral I posted before. I start with some very abstract notions and specialize to more elementary things and see what the abstract notions imply. When I get something that's very hard to see with elementary methods, I propose it as a hard exercise.

If the topic was something like: why is the golden ratio the solution to the equation x^2=x+1, then certainly I would stay close to arithmetic. But that's not what the topic is about. The topic is about why it might make sense to assign the value -1/12 to 1+2+3+4+..., and specifically how blackpenredpen's three-line calculation fits into that. I answered at a technical level that I think is appropriate for this topic. To appreciate it, you of course need to read a considerable portion of the literature and understand some definitions, some of which even take proofs to show they make sense, and a lot of knowledge to contextualize them. Doing all of this looks intimidating at first, but I am constantly surprised by smart people who manage to do it. If, of course, it looks superfluous or uninteresting, then the answer to the original question would look just as superfluous or uninteresting.
Banned User, Former player
Joined: 3/10/2004
Posts: 7698
Location: Finland
p4wn3r wrote:
Just so we're in the clear, I will not attempt to explain why only in terms of arithmetic for the following reason: there seems to be a widespread misconception that everything mathematicians do should be neatly expressible using numbers, addition, and multiplication.
I did not ask because I think that everything should be expressible using arithmetic. I asked because my strongest knowledge and experience of mathematics is, essentially, high-school level math, or what I would call "practical math" (arithmetic, elementary algebra, analytic geometry, basic trigonometry). Once you start throwing integrals, derivatives and infinite summations and products into the mix, it starts going over my head fast. (I know how to derive most stuff, and I remember something from high-school and university integration lessons, but that's about it.) In other words, a very esoteric answer using advanced postgrad math wouldn't be a very useful answer for me, personally.
Skilled player (1404)
Joined: 10/27/2004
Posts: 1976
Location: Making an escape
I'm in a similar boat with Warp. One of my biggest regrets in life is not taking any formal math courses beyond college level algebra. Regardless, I love the subject and am always trying to learn more about it and find myself completely humbled to look in this topic. Then I go to work where "Subtract x from both sides" is practically god-tier skills. On the present subject, I do remember tooling around on Wolfram and discovering that the limit of the real part of Zeta(1) as you approach from the direction of i is the EM constant. Doesn't surprise me as Zeta(1) is used in defining the constant in the first place, but both real and imaginary parts seem to shoot off to infinity if approached from any other direction (barring the real directions, where the imaginary part is 0), which in my mind creates a very weird looking pole. Now clearly, actually understanding why this is the case is beyond me, but I do have to wonder, is this fact part of assigning the EM constant to Zeta(1)?
A hundred years from now, they will gaze upon my work and marvel at my skills but never know my name. And that will be good enough for me.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
Ferret Warlord wrote:
Now clearly, actually understanding why this is the case is beyond me, but I do have to wonder, is this fact part of assigning the EM constant to Zeta(1)?
Pretty cool that you're exploring these things! One way to understand it is to simply plug the harmonic series into the definition of Ramanujan summation. You end up with the difference between an integral of 1/x, which gives ln n, and the n-th harmonic number; this is pretty much the definition of the constant. To link with the zeta function, the idea is that it can be expanded as a Laurent series at s=1 like this:
zeta(s) = 1/(s-1) + gamma + O(|s-1|)
The first term obviously gives rise to a singularity at s=1, but there's a rigorous way of treating this pole as an "infinitesimal"; it's pretty similar to the p-adic metric I talked about earlier. So, in this sense, you could write zeta(1) "=" gamma
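If you want to see the constant appear, here is a tiny Python sketch (the function name is mine) that computes the n-th harmonic number minus ln n for growing n.

```python
import math


def harmonic_minus_log(n):
    """n-th harmonic number minus ln(n); tends to the Euler-Mascheroni constant."""
    return math.fsum(1 / k for k in range(1, n + 1)) - math.log(n)


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print("n =", n, "H_n - ln n =", harmonic_minus_log(n))
    print("gamma ~ 0.5772156649")
```

The convergence is slow, with an error of about 1/(2n), so a million terms only gives about six correct digits.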
Editor, Skilled player (1938)
Joined: 6/15/2005
Posts: 3243
Edit: Some edits to this post.
Since we've been talking about infinite series a lot recently, the Riddler Classic problem this week might interest you: (My edits in square brackets to hopefully clarify things)
----------------------------------------------------------
[A] tortoise and [a] hare are about to begin a 10-mile race along a “stretch” of road. The tortoise is driving a car that travels 60 miles per hour, while the hare is driving a car that travels 75 miles per hour. [Assume both cars instantly attain those speeds during movement.] The hare[, wanting to end at the same time as the tortoise,] realizes if it [were to wait] until two minutes have passed, they’ll cross the finish line at the exact same moment. And so, when the race begins, the tortoise drives off while the hare patiently waits.
But one minute into the race, after the tortoise has driven 1 mile, something extraordinary happens. The road turns out to be magical and instantaneously stretches by 10 miles! [The road stretches linearly, taking whatever is on the road with it, including the hare, tortoise, and finish line.] As a result of this stretching, the tortoise is now 2 miles ahead of the hare, who remains at the starting line. At the end of every subsequent minute, the road stretches by [an additional] 10 miles [in a similar fashion].
With this in mind[, how] long after the race has begun should the hare wait so that both the tortoise and the hare will cross the finish line at the same exact moment?
----------------------------------------------------------
Hint: You don't actually need to find how long it takes for the tortoise to finish the race to answer this problem.
Player (42)
Joined: 12/27/2008
Posts: 873
Location: Germany
I might as well post the solution to my problem already. In my opinion, it's the first result in category theory that's difficult to see without resorting to it, and helps abstract lots of proofs. The trick is, for very abstract statements like this, for them to be true, the statement has to follow from piecing together the functions in the problem statement, and quite often you can do this by only "chasing" the types. I'm taking the yellow images from here:
(1) If we have two natural transformations n: F -> G and m: G -> H, where F,G,H: C -> D (that is, all of them are functors from category C to category D), show that we can compose them, and that composition has the structure of a monoid (it's associative and has an identity). This is the so-called vertical composition of natural transformations.
(Replace alpha with n, beta with m) Given this diagram, how could we possibly construct the natural transformation? For each object X in obj(C), we must associate a morphism. We have two morphisms available, n_X from the transformation n, and m_X from the transformation m. If we write them as a commutative diagram: The first square commutes by the naturality of n. The second commutes by the naturality of m. Thus, the whole diagram commutes. If we think of the composition m_X o n_X, it satisfies the requirements for a natural transformation. So, to build the vertical composition, we simply compose the morphisms. It clearly has an identity (take the identity morphism) and is associative, because morphism composition is associative.
(2) If we have a natural transformation a: F -> G, where F, G: C->D, and a functor H: D -> E, show that we can find a natural transformation from H o F to H o G, called the right whiskering of a with respect to H.
This time, we have a morphism and a functor. The key insight is that the morphism a_X maps objects in the category D and we can apply the functor H to it to make it map objects in E. So, we apply H to the whole square of the natural transformation: By looking at this square, it's clear from the definition that H o a defines a natural transformation from the functor H o F to the functor H o G.
(3) Similarly, show that if we have a natural transformation b: G -> H, where G,H: D -> E, and a functor F: C -> D, we can find a natural transformation from G o F to H o F, called the left whiskering of b with respect to F.
Completely analogous, just write down the square for the natural transformation, and make it explicit that objects from D come from the functor F applied to an object in C: From this diagram, it's clear that b o F defines a natural transformation from G o F to H o F.
(4) Finally, if we have two natural transformations a: F-> G and b: H -> K, where F,G : C -> D and H,K: D-> E, show that we can define the horizontal composition b o a : H o F -> K o G in two ways: (a) Left whiskering followed by right whiskering (b) Right whiskering followed by left whiskering Also, show that the definitions (a) and (b) define the same natural transformation.
Just look at the types.
(a) We start with b from H to K, and forget the functor G. If we left whisker, we have the natural transformation b o F from H o F to K o F. Now, if we start with a from F to G and ignore the functor H, we can right whisker and find the natural transformation K o a from K o F to K o G. Finally, we can vertically compose them to find: (K o a) "o" (b o F): H o F -> K o G
Notice that here o denotes functor composition, while "o" denotes vertical composition of natural transformations, so I denote them differently, because they are different operations.
(b) Do the same thing as (a). Start with a and forget the functor K. Right whisker to obtain H o a: H o F -> H o G. Now, start with b and forget the functor F. Left whisker to obtain b o G: H o G -> K o G. Compose them vertically to find: (b o G) "o" (H o a): H o F -> K o G
How do we prove these are the same thing? There are several commuting diagrams you can draw. I don't like the one on the slides I got the yellow images from. The one I like is: You can think of it as a cube. Naturality and functoriality make everything commute. All arrows from one square to the other are the functor applied to the morphism f; I didn't write it because it would be a mess. The construction in (a) amounts to the path H(F(X)) -> K(F(X)) -> K(F(Y)) -> K(G(Y)), and the construction in (b) is simply H(F(X)) -> H(G(X)) -> H(G(Y)) -> K(G(Y)). Since the whole thing commutes, these two things are equal.
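Here is a concrete toy version of (a) and (b) in Python, purely as an illustration under my own choice of functors (none of these names come from the slides). I take F = H = "list" and G = K = "maybe" (modelled by tuples of length 0 or 1), and let both a and b be the "first element, if any" transformation from list to maybe. Then H o F is "list of lists", K o G is "maybe of maybe", and the two recipes can be compared directly.

```python
def fmap_list(f, xs):
    """Action of the list functor (playing F and H) on a function f."""
    return [f(x) for x in xs]


def fmap_maybe(f, m):
    """Action of the maybe functor (playing G and K); () is 'nothing'."""
    return tuple(f(x) for x in m)


def safe_head(xs):
    """Component of both a: F -> G and b: H -> K in this toy example."""
    return (xs[0],) if xs else ()


def horizontal_a(xss):
    """(K o a) "o" (b o F): apply b at F(X), then K applied to a_X."""
    return fmap_maybe(safe_head, safe_head(xss))


def horizontal_b(xss):
    """(b o G) "o" (H o a): apply H to a_X, then b at G(X)."""
    return safe_head(fmap_list(safe_head, xss))


if __name__ == "__main__":
    for xss in ([[1, 2], [3]], [[], [3]], []):
        print(xss, horizontal_a(xss), horizontal_b(xss))
```

Both routes give the same nested result on every input, which is what the commuting cube predicts. In this instance the agreement is just the naturality of b applied to the morphism a_X.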
Bonus: why are they called vertical and horizontal composition?
Compare the diagram in (1) to the diagram in (4). In (1), the transformations that are composed are drawn one above the other, while in (4), they are written next to each other. Because of that, (1) is vertical and (4) is horizontal. Of course, this is merely convention, and you can obviously draw things another way. However, these names have stuck.