Replica Method for the Boolean Perceptron with Continuous Weights

A brief derivation of the asymptotic generalization error of a Boolean perceptron with continuous weights using the replica method. This problem was also the final assignment in the ‘Statistical Mechanics of Neural Networks’ course at SYSU, Fall 2024.

1. Question

Consider the asymptotic form of the generalization error of a Boolean perceptron with continuous weights in the large-$\alpha$ limit, where $\alpha = P/N$ is the number of training examples per weight.¹

2. Setup

We consider the perceptron model

$$\sigma(\mathbf{x}) = \operatorname{sign}\!\left(\frac{\mathbf{w}\cdot\mathbf{x}}{\sqrt{N}}\right),$$

where the weight vector $\mathbf{w} \in \mathbb{R}^N$ satisfies the spherical constraint $\|\mathbf{w}\|^2 = N$, and the input $\mathbf{x} \in \mathbb{R}^N$.

The training set is $\mathcal{D} = \{(\mathbf{x}^\mu, y^\mu)\}_{\mu=1}^{P}$, with labels

$$y^\mu = \operatorname{sign}\!\left(\frac{\mathbf{w}^*\cdot\mathbf{x}^\mu}{\sqrt{N}}\right),$$

where the teacher weight $\mathbf{w}^*$ satisfies $\|\mathbf{w}^*\|^2 = N$, and the data are i.i.d. Gaussian, $x_i^\mu \sim \mathcal{N}(0,1)$.

The loss function is

$$\varepsilon(\mathbf{w};\mathbf{x},y) = \Theta\!\left(-\,y\,\frac{\mathbf{w}\cdot\mathbf{x}}{\sqrt{N}}\right),$$

where $\Theta(\cdot)$ is the Heaviside step function.

The generalization error is the probability that the student misclassifies a fresh example,

$$\epsilon_g = \left\langle \Theta\!\left(-\,y\,\frac{\mathbf{w}\cdot\mathbf{x}}{\sqrt{N}}\right) \right\rangle_{\mathbf{x}},$$

where the average is over a new input $\mathbf{x} \sim \mathcal{N}(0, I_N)$ with teacher label $y$.
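
As a quick numerical illustration of this setup (not part of the derivation), the following sketch, assuming NumPy and arbitrary values of $N$ and $P$, builds a teacher, a training set, and a Monte Carlo estimator of $\epsilon_g$ for an arbitrary student $\mathbf{w}$.

```python
# Minimal sketch of the setup: a spherical teacher labels i.i.d. Gaussian
# inputs, and eps_g of a student w is estimated as its disagreement rate with
# the teacher on fresh Gaussian inputs. N, P and sample sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 400

def sample_sphere(n, rng):
    """Uniform random vector on the sphere ||w||^2 = n."""
    w = rng.standard_normal(n)
    return w * np.sqrt(n) / np.linalg.norm(w)

w_star = sample_sphere(N, rng)                    # teacher, ||w*||^2 = N
X = rng.standard_normal((P, N))                   # inputs x^mu
y = np.sign(X @ w_star / np.sqrt(N))              # teacher labels y^mu

def generalization_error(w, w_star, rng, n_test=50_000):
    """Monte Carlo estimate of eps_g = P[sign(w.x) != sign(w*.x)]."""
    X_test = rng.standard_normal((n_test, len(w)))
    return np.mean(np.sign(X_test @ w) != np.sign(X_test @ w_star))

w_random = sample_sphere(N, rng)                  # an untrained student
print("training error  :", np.mean(np.sign(X @ w_random / np.sqrt(N)) != y))
print("eps_g (random w):", generalization_error(w_random, w_star, rng))  # ~ 0.5
```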

3. Solution

Define the teacher–student overlap $R = \frac{\mathbf{w}\cdot\mathbf{w}^*}{N}$. The generalization error can be written as

$$\epsilon_g = \frac{1}{\pi}\arccos R,$$

where $R$ is the fixed point of the replica-symmetric saddle-point equation derived in Sec. 4.3 below.

As $\alpha \to \infty$, the asymptotic behavior is

$$\epsilon_g \simeq \frac{0.625}{\alpha}.$$
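
The closed form $\epsilon_g = \arccos(R)/\pi$ is easy to verify numerically. The sketch below (illustrative, assuming NumPy; the helper names are not from the derivation) builds a student with a prescribed overlap $R$ to the teacher and compares its Monte Carlo disagreement rate with the formula.

```python
# Monte Carlo check of eps_g = arccos(R)/pi: construct a student with a
# prescribed overlap R to the teacher and compare its disagreement rate
# on fresh Gaussian inputs with the formula.
import numpy as np

rng = np.random.default_rng(1)
N = 200

def sample_sphere(n, rng):
    w = rng.standard_normal(n)
    return w * np.sqrt(n) / np.linalg.norm(w)

w_star = sample_sphere(N, rng)

def student_with_overlap(w_star, R, rng):
    """Return w with ||w||^2 = N and w.w*/N = R, by construction."""
    z = rng.standard_normal(len(w_star))
    z -= (z @ w_star) / (w_star @ w_star) * w_star        # part orthogonal to w*
    z *= np.sqrt(len(w_star)) / np.linalg.norm(z)
    return R * w_star + np.sqrt(1.0 - R**2) * z

X = rng.standard_normal((50_000, N))                       # fresh test inputs
for R in [0.0, 0.3, 0.6, 0.9, 0.99]:
    w = student_with_overlap(w_star, R, rng)
    mc = np.mean(np.sign(X @ w) != np.sign(X @ w_star))
    print(f"R={R:4.2f}   Monte Carlo={mc:.4f}   arccos(R)/pi={np.arccos(R)/np.pi:.4f}")
```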


4. Derivation

4.1. Statistical Mechanics Formulation

The partition function is

$$Z = \int d\mu(\mathbf{w})\, \exp\!\left(-\beta \sum_{\mu=1}^{P} \varepsilon(\mathbf{w};\mathbf{x}^\mu, y^\mu)\right),$$

where $d\mu(\mathbf{w})$ denotes the uniform measure on the sphere $\|\mathbf{w}\|^2 = N$ and $\beta$ is the inverse temperature.

Noting that the loss takes only the values $0$ and $1$, we take the zero-temperature limit $\beta \to \infty$ and use

$$\lim_{\beta\to\infty} e^{-\beta\,\varepsilon(\mathbf{w};\mathbf{x}^\mu, y^\mu)} = \Theta\!\left(y^\mu\,\frac{\mathbf{w}\cdot\mathbf{x}^\mu}{\sqrt{N}}\right),$$

which corresponds to the noiseless-output setting: $Z$ then reduces to the volume of the version space, the set of students that classify every training example correctly.

Define the free energy $F = -\frac{1}{\beta}\,\overline{\ln Z}$, where the overline denotes the average over the training data. Using the replica trick,

$$\overline{\ln Z} = \lim_{n \to 0}\frac{\overline{Z^n} - 1}{n},$$

the free-energy density becomes

$$f = \lim_{N\to\infty}\frac{F}{N} = -\lim_{N\to\infty}\frac{1}{\beta N}\,\lim_{n \to 0}\frac{\overline{Z^n} - 1}{n}.$$
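
As a toy sanity check of this limit (purely illustrative, and sidestepping the analytic continuation that the actual replica computation requires), consider a scalar "partition function" $Z = e^{g}$ with Gaussian $g$, for which $\overline{\ln Z}$ is known exactly:

```python
# Toy check of <ln Z> = lim_{n->0} (<Z^n> - 1)/n for a lognormal Z = exp(g),
# g ~ N(mu, sigma^2), where <ln Z> = mu exactly.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 1.5, 0.7
Z = np.exp(rng.normal(mu, sigma, size=1_000_000))

print("direct  <ln Z> =", np.log(Z).mean())
for n in [0.5, 0.1, 0.01]:
    print(f"n={n:4}:  (<Z^n> - 1)/n = {(np.mean(Z**n) - 1.0) / n:.4f}")
# the replica estimate approaches <ln Z> = mu = 1.5 as n -> 0
```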

Using the integral representation of unity, $1 = \int d\lambda\,\delta\!\left(\lambda - \frac{\mathbf{w}\cdot\mathbf{x}^\mu}{\sqrt{N}}\right)$, introduce for each replica $a = 1,\dots,n$ and each pattern $\mu$ the auxiliary fields

$$\lambda_\mu^a = \frac{\mathbf{w}^a\cdot\mathbf{x}^\mu}{\sqrt{N}}, \qquad u_\mu = \frac{\mathbf{w}^*\cdot\mathbf{x}^\mu}{\sqrt{N}}.$$

Then one can rewrite the free-energy density as

The fields $(\lambda_\mu^1,\dots,\lambda_\mu^n, u_\mu)$ are jointly Gaussian with respect to the random inputs, with

$$\overline{\lambda_\mu^a \lambda_\mu^b} = \frac{\mathbf{w}^a\cdot\mathbf{w}^b}{N}, \qquad \overline{\lambda_\mu^a u_\mu} = \frac{\mathbf{w}^a\cdot\mathbf{w}^*}{N}, \qquad \overline{u_\mu^2} = 1.$$

Introducing the order parameters

$$q_{ab} = \frac{\mathbf{w}^a\cdot\mathbf{w}^b}{N}, \qquad R_a = \frac{\mathbf{w}^a\cdot\mathbf{w}^*}{N}$$

into the free energy yields

where

In the large-$N$ limit, evaluate the integral over the order parameters by the saddle-point method:

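A one-dimensional toy version of this saddle-point (Laplace) argument can be checked numerically (illustrative only; the function $g$ below is arbitrary): for large $N$, $\frac{1}{N}\ln\int dx\, e^{N g(x)} \to \max_x g(x)$.

```python
# Toy check of the Laplace / saddle-point method: for large N,
# (1/N) ln int dx e^(N g(x)) approaches max_x g(x).
import numpy as np

g = lambda x: 0.5 * x**2 - 0.25 * x**4        # maxima at x = +-1, g(+-1) = 1/4
x = np.linspace(-3.0, 3.0, 20001)
dx = x[1] - x[0]

for N in [10, 100, 1000]:
    gmax = g(x).max()
    log_integral = np.log(np.sum(np.exp(N * (g(x) - gmax))) * dx) + N * gmax
    print(f"N={N:5d}:   (1/N) ln integral = {log_integral / N:.5f}   max g = {gmax:.5f}")
```
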
4.2. Replica Symmetric Ansatz

Assume replica symmetry:

$$q_{ab} = q \quad (a \neq b), \qquad R_a = R \quad \text{for all } a.$$

Then Eq. (17) becomes

Using the Hubbard–Stratonovich transformation

$$e^{x^2/2} = \int \frac{dt}{\sqrt{2\pi}}\, e^{-t^2/2 + x t} \equiv \int Dt\; e^{x t},$$

one obtains the reparameterized form of Eq. (18):

and further evaluating the Gaussian integral yields

Using the symmetry between the label values $y^\mu = +1$ and $y^\mu = -1$, together with the reparameterization

Eq. (19) becomes

where the following definitions were used:

$$Dt \equiv \frac{dt}{\sqrt{2\pi}}\, e^{-t^2/2}, \qquad H(x) \equiv \int_x^{\infty} Dt.$$
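
For numerical work it is convenient to have these objects as code. The helpers below (illustrative, assuming NumPy/SciPy) use $H(x) = \frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$ and Gauss–Hermite quadrature for $\int Dt\,(\cdot)$, and also re-check the Hubbard–Stratonovich identity from above.

```python
# Helpers for the Gaussian measure Dt and for H(x), plus numerical checks of
# H(x) and of the Hubbard-Stratonovich identity exp(x^2/2) = int Dt exp(x t).
import numpy as np
from scipy.special import erfc
from scipy.integrate import quad

def H(x):
    """H(x) = integral_x^inf dt exp(-t^2/2)/sqrt(2 pi) = erfc(x/sqrt(2))/2."""
    return 0.5 * erfc(x / np.sqrt(2))

def gauss_avg(f, n_nodes=120):
    """Integral of f(t) against Dt, via Gauss-Hermite quadrature (weight e^(-t^2/2))."""
    t, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    return (w @ f(t)) / np.sqrt(2.0 * np.pi)

for x in [-1.0, 0.0, 2.0]:
    direct, _ = quad(lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi), x, np.inf)
    print(f"H({x:4}) = {H(x):.6f}   quadrature = {direct:.6f}")

for a in [0.5, 1.0, 2.0]:
    print(f"a={a}:  exp(a^2/2) = {np.exp(a**2 / 2):.6f}   "
          f"int Dt e^(at) = {gauss_avg(lambda t: np.exp(a * t)):.6f}")
```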

Therefore all three terms under the replica symmetric ansatz are

and the disorder-averaged free-energy density becomes

4.3. Saddle-point Equations

Letting the derivatives of the free energy with respect to the order parameters vanish gives

Substituting into Eq. (35) yields

Setting leads to

The solution satisfies $q = R$. (Intuitively, the teacher and a student drawn from the version space are both uniformly random on the sphere $\|\mathbf{w}\|^2 = N$ subject to the same data constraints, so the student–student overlap $q$ and the student–teacher overlap $R$ are not expected to differ.)
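
This can be probed directly in a very small system (an illustrative experiment, not part of the derivation): sample students uniformly on the sphere, keep those with zero training error, and compare the mean student–student overlap with the mean student–teacher overlap. At such small $N$ the finite-size fluctuations are large, so only rough agreement is expected.

```python
# Toy check that q ~ R: rejection-sample the version space (students with zero
# training error) at small N and compare mean student-student and
# student-teacher overlaps. Sizes are kept tiny so rejection sampling works.
import numpy as np

rng = np.random.default_rng(3)
N, P, n_candidates = 12, 10, 500_000

def sample_sphere(shape, rng):
    w = rng.standard_normal(shape)
    return w * np.sqrt(shape[-1]) / np.linalg.norm(w, axis=-1, keepdims=True)

w_star = sample_sphere((N,), rng)
X = rng.standard_normal((P, N))
y = np.sign(X @ w_star)

W = sample_sphere((n_candidates, N), rng)               # candidate students
in_version_space = np.all(np.sign(W @ X.T) == y, axis=1)
V = W[in_version_space]
print("version-space samples:", len(V))

R_vals = (V @ w_star) / N                               # student-teacher overlaps
Q = (V @ V.T) / N                                       # student-student overlaps
q_vals = Q[np.triu_indices(len(V), k=1)]
print(f"mean R = {R_vals.mean():.3f}    mean q = {q_vals.mean():.3f}")
```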

Using the change of variables

one can rewrite the fixed-point equation as

Iterating this equation to convergence gives the value of $R$.
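
A numerical sketch is given below. It is an assumption of this sketch that, instead of iterating the exact fixed-point equation written above (whose parametrization is specific to this text), one may equivalently locate the replica-symmetric saddle point by maximizing the standard replica-symmetric entropy of the version space with $q = R$ imposed, $s(R) = \frac{1}{2}\left[\ln(1-R) + R\right] + 2\alpha \int Dt\, H(\gamma t)\ln H(\gamma t)$ with $\gamma = \sqrt{R/(1-R)}$, a form commonly found in the perceptron literature.

```python
# Illustrative numerical solution for R(alpha) and eps_g(alpha). Assumption:
# rather than iterating the fixed-point equation of the text, this maximizes
# the standard replica-symmetric entropy of the version space with q = R,
#   s(R) = (1/2)[ln(1-R) + R] + 2*alpha * int Dt H(gamma t) ln H(gamma t),
#   gamma = sqrt(R / (1 - R)).
import numpy as np
from scipy.special import erfc, xlogy
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def H(x):
    return 0.5 * erfc(x / np.sqrt(2))

def energetic(R):
    """2 * int Dt H(gamma t) ln H(gamma t), with gamma = sqrt(R/(1-R))."""
    gamma = np.sqrt(R / (1.0 - R))
    if gamma <= 1.0:   # integrand varies on the scale of Dt itself
        f = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi) * xlogy(H(gamma * t), H(gamma * t))
    else:              # substitute s = gamma*t so the support stays O(1)
        f = lambda s: np.exp(-(s / gamma)**2 / 2) / (gamma * np.sqrt(2 * np.pi)) * xlogy(H(s), H(s))
    val, _ = quad(f, -np.inf, np.inf)
    return 2.0 * val

def entropy(R, alpha):
    return 0.5 * (np.log(1.0 - R) + R) + alpha * energetic(R)

for alpha in [1, 2, 5, 10, 20, 50]:
    res = minimize_scalar(lambda R: -entropy(R, alpha),
                          bounds=(1e-6, 1.0 - 1e-9), method="bounded")
    R, eps_g = res.x, np.arccos(res.x) / np.pi
    print(f"alpha={alpha:3d}   R={R:.5f}   eps_g={eps_g:.5f}   0.625/alpha={0.625/alpha:.5f}")
```

Direct one-dimensional maximization is used here instead of fixed-point iteration because it does not depend on how the saddle-point equation is parametrized; at large $\alpha$ the printed $\epsilon_g$ approaches the $0.625/\alpha$ asymptote quoted above.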

4.4. Generalization Error

Using the re-parameterization

and the identity

one can rewrite the generalization error as

$$\epsilon_g = \frac{1}{\pi}\arccos R.$$

Consider the limit $\alpha \to \infty$, in which $R \to 1$. Let $R = 1 - \delta$ with $0 < \delta \ll 1$. Substituting into Eq. (41) gives

Using this together with $\epsilon_g = \frac{1}{\pi}\arccos R$ and the small-$\delta$ expansion $\arccos(1-\delta) \simeq \sqrt{2\delta}$, one obtains the asymptotic form of the generalization error:

$$\epsilon_g \simeq \frac{0.625}{\alpha}.$$
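
The small-$\delta$ expansion used in this last step can be checked directly (trivial numerical sketch):

```python
# Check of the small-delta expansion arccos(1 - delta) ~ sqrt(2*delta),
# which turns R = 1 - delta into eps_g = arccos(R)/pi ~ sqrt(2*delta)/pi.
import numpy as np

for delta in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(f"delta={delta:7.0e}   arccos(1-delta)={np.arccos(1 - delta):.6f}"
          f"   sqrt(2 delta)={np.sqrt(2 * delta):.6f}")
```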


Footnotes

  1. This problem was first solved under the replica symmetric ansatz by G. Györgyi and N. Tishby (1990). It was also discussed in the perceptron work of H. S. Seung, H. Sompolinsky, and N. Tishby (1992), and in Chapter 8 of H. Nishimori’s book (2001), which is a highly valuable reference.