In a Prisoner's Dilemma problem, each agent has to choose between an action A that benefits itself by a certain amount X, and an action B that benefits each individual in the group by less than X. However, if most of the agents choose their individually best option, A, no one benefits, and they may even end up worse off.
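To make the structure concrete, here is a minimal sketch in Python. The numbers and the exact payoff rule are assumptions chosen for illustration, not part of the definition above: choosing A gives the chooser `X`, each agent choosing B gives `b` to every member of the group, and, to capture the "could even get damaged" case, each agent choosing A costs every other agent a hypothetical harm `h`.

```python
def payoff(my_action, others, X=3, b=1, h=1):
    """Payoff for one agent, given its own action and the others' actions.

    Assumed rule (illustrative only):
      - choosing "A" gives the chooser X;
      - every agent choosing "B" (including me) gives b to each group member;
      - every *other* agent choosing "A" costs me h.
    """
    others = list(others)
    n_b = others.count("B") + (1 if my_action == "B" else 0)
    n_a_others = others.count("A")
    own = X if my_action == "A" else 0
    return own + b * n_b - h * n_a_others


# A group of five agents: compare the four corner cases.
everyone_b = payoff("B", ["B"] * 4)   # everybody picks the group option
lone_a = payoff("A", ["B"] * 4)       # I pick A while the others pick B
everyone_a = payoff("A", ["A"] * 4)   # everybody picks A
lone_b = payoff("B", ["A"] * 4)       # I pick B while the others pick A

print(everyone_b, lone_a, everyone_a, lone_b)
```

With these assumed values, A pays more than B for the individual no matter what the others do (the gap is always `X - b`), yet a group where everyone picks A ends up worse off than a group where everyone picks B.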
There are several examples that make this less abstract, but I will use an uncommon one. Let's say that you have to choose between two ways of advancing your career. The first is selfish and dodgy, and hurts several people in the process of getting to the top. The second is to painstakingly treat everyone nicely, never step on anyone's toes, and try to build as good a career as you can within these limits. Let's say that you can be pretty sure that, with the first option, you will get to the top earlier. What would you do? What should you do? What do you think anyone should do?
Intuitively, we know that one of the options is morally wrong, and the other is morally right, or neutral. We know which one is which because we feel it, and we don't have to think about it.1
It also happens that the immoral option is the most rational one, in terms of evolutionary fitness. It doesn't make sense to take the longer route, suffer through it, and maybe not even get the same results, when I can reach a better position faster, without hurting my chances of finding a suitable mate for my offspring (maybe not within the pool of people I have hurt, but I will have access to another pool of people, more powerful and thus more convenient, evolutionarily speaking).2
This same reasoning holds for everybody, but if everybody did that, we would live in a horrible world where everybody hurts each other for their own benefit. This, like many moral problems, is a Prisoner's Dilemma problem (from now on, PDp)3.
Across history, there must have been groups that consistently tackled the PDp by choosing the most individually convenient action. We are not one of those groups. Those groups are probably extinct or evolved into something different, as their actions would in the long run damage the group itself, and would make any civilization impossible.
As a species, we humans have mostly solved these types of problems through coordinated signalling. We have developed a way to signal to each other that someone is solving a PDp in an individualistic, group-hurting way. The signals lead to a punishment: ostracism, imprisonment, etc. These signals are mostly aimed at other agents in the group. At a certain point, however, it just becomes convenient to aim them at ourselves: we don't want to be the target of retaliation, we want to avoid punishment, and thus we need to automatically tell ourselves what the best thing to do is when facing a PDp. But watch out! The best thing to do in this case is the opposite of what you would do if you were a rational individualistic agent. Thus, this feeling has to be innate and irrational (it has to come from your gut and not from your head), because it goes completely against our evolutionary drive to do the best thing for ourselves.
So we send signals to ourselves to avoid punishment. You also know that a signal of "you are doing something PDp-wrong" (as in something that would hurt your group and benefit yourself) is most likely going to be followed by a punishment. When you consistently associate a signal with a certain punishment, the signal becomes the punishment itself. In this way, signalling that someone is doing something PDp-wrong is a way to punish them, and people have developed ways of efficiently signalling to each other. Internally, people can signal to and punish themselves with a sense of guilt for taking a PDp-wrong action. Externally, they can use a variety of techniques, such as social shaming. If this seems absolutely horrible to you, imagine a society where this doesn't happen. If you take the signalling out of the equation, the set of people that solve PDp in an individualistic way will take over, and this would be horrible for everyone.
So we developed signals for indicating actions that are good for the individual but bad for the group. These signals are associated with punishment, and they are themselves "punishing". They can be targeted at each other, but it soon becomes convenient to target them at yourself as well, to prevent group retaliation. Put all of this together, and you get a morality system.
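The effect of signalling plus punishment can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `X` and `b` are the individual and group benefits of actions A and B, `detect_prob` is a hypothetical chance that a PDp-wrong action gets signalled, and `penalty` is a hypothetical cost of the punishment (or of the guilt) that follows.

```python
def gain_from_defecting(X=3, b=1, detect_prob=0.0, penalty=0):
    """Expected extra payoff from choosing A instead of B.

    Without signalling, A always beats B by X - b for the individual.
    With signalling, the expected punishment is subtracted from that gain.
    """
    return (X - b) - detect_prob * penalty


def best_action(**kwargs):
    """Action a purely self-interested agent would pick under these assumptions."""
    return "A" if gain_from_defecting(**kwargs) > 0 else "B"


print(best_action())                              # no signalling at all
print(best_action(detect_prob=0.5, penalty=2))    # weak punishment
print(best_action(detect_prob=0.5, penalty=10))   # strong punishment
```

In this toy model, the group-friendly action B only becomes the individually best choice once the expected punishment `detect_prob * penalty` is at least as large as the temptation `X - b`. A weak or rarely triggered signal leaves A as the self-interested choice.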
Morality is the way humans solve Prisoner's Dilemma problems. A moral problem is a Prisoner's Dilemma problem.
Why is the most rational action in moral problems to behave immorally? Because morality was developed precisely to prevent people from behaving rationally in moral problems.