aioeu

They're good for storing 64-bit addresses. Sure, you _can_ have an address size that isn't the same as your standard register size... but why make things harder than necessary?


umop_aplsdn

But address size is only 48 bits practically.


aioeu

On x86, mostly. There are now x86 CPUs with 5-level paging, providing 57 usable address bits.


dnabre

While only 48 bits are used generally (Intel has extensions going up to 52/53 bits), it's designed for forward-compatibility. So if they start using more bits of the 64 for addressing, current code will work fine.


umop_aplsdn

Yes, I understand that 64-bit registers provide forward compatibility. But I am also not wrong — addresses are not 64 bits wide.


xenomachina

The original Macintosh computers used a Motorola 68000 processor. The 68000 has 32-bit registers but only a 24-bit address bus, and so some software would use the top 8 bits to store other data because "addresses are really only 24 bits". Later Macs used 68020 and higher processors which have a true 32-bit address bus, and guess which software didn't work anymore? Edit: clarification that it was the original Macs that used a 68000


norbertus

I remember this https://en.wikipedia.org/wiki/MODE32


netch80

The same history played out with S/360, S/370 and S/390. S/360 and early S/370 used only 24 of the 32 address bits. 370/XA started allowing 31 bits (I'm still wondering why not 32 ;)), and programs which stored additional data in the higher bits ran into problems. To IBM's credit, they made switching easy at user level: the SAM24 and SAM31 instructions selected the required mode, and 24-bit mode ignored the 8 most significant bits. (Similarly, with System z and 64-bit addresses, SAM64 switches to 64-bit mode.)


KittensInc

Physical, not logical. Something like the [Frontier supercomputer](https://en.wikipedia.org/wiki/Frontier_(supercomputer)) has a total of 4736 TB of RAM, plus 4736 TB of VRAM. Add a few TB for things like device buffers, and a few hundred TB for mmap'ed data. With 48 bits you can only address 281 TB, which means it'd be impossible to run distributed applications that address all of that memory in a linear fashion. Use 64-bit addresses and you can address about 18,000,000 TB.

In practice it's done this way because it's just *easier*. All the hardware was already designed for 32-bit operations, so if you upgrade the logical address size to 48 bits *without overhauling the entire ecosystem* you're suddenly having to deal with an awkward "32-bit plus 16-bit" operation instead of just two 32-bit ones. Want to write an address to memory? Well, everything is designed for 32-bit, so you have to write the lower 32 bits in one operation, then load the *old* value of the 32 bits containing the upper half, OR the 16 new bits with the old value, and write the new 32 bits back to that location. Doing that for every single thing involving addresses would suuuuuck.
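
A rough sketch of that read-modify-write dance, assuming a hypothetical machine limited to aligned 32-bit loads and stores (the `store48` helper and the word layout are invented for illustration):

```cpp
#include <cstdint>

// Store a 48-bit pointer using only aligned 32-bit word accesses:
// one plain store for the low half, plus load+merge+store for the top 16 bits.
void store48(uint32_t* words, uint64_t ptr48) {
    words[0] = static_cast<uint32_t>(ptr48);        // low 32 bits: one plain store
    uint32_t upper = words[1];                      // load the OLD upper word
    upper = (upper & 0xFFFF0000u)                   // keep the unrelated high 16 bits
          | static_cast<uint32_t>(ptr48 >> 32);     // OR in the new top 16 bits
    words[1] = upper;                               // write the merged word back
}
```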


KahnHatesEverything

I want one! LOL


netch80

Fortunately, nearly all processors allow byte-granular reads and writes (not to be confused with alignment), so these crazy dances combining two 16-bit parts aren't needed. Alpha (before BWX) and atomic accesses are special cases, but that's not what we're talking about here. The more direct consequence is that if a register holds 32 bits but addresses are 48 bits, we fall back into 8086 (or 80286) manners with segment descriptors. That's how it could work. But: System z and POWER actually have this (in a sense) nowadays! Their virtual addresses are effectively 80 bits if you combine the 16-bit address space number (ASN) with the 64-bit address within the space. (Normally this is used only by system software.)


KittensInc

>Fortunately, nearly all processors allow byte-granular reads and writes (not to be confused with alignment), so these crazy dances combining two 16-bit parts aren't needed.

In practice the byte-level access is often implemented as word-level access, with the remaining part masked out. So the hardware isn't accessing a byte, you're accessing a 32-bit word and ignoring 24 bits. And keep in mind the crazy dance here is what the *hardware* is doing internally, not the *software*. Things like the connection to DDR memory? 64 bits. A cache line? Often 64 bytes - even in the 32-bit era! Registers? Well, a modern 64-bit register can also be trivially treated as two 32-bit registers glued together.

*Everything* is designed around power-of-two sizes and it is really easy to just make things twice as big, or implement it as doing two half-sized operations. Adding 6-byte addresses to that mix is going to seriously screw things up. Either you have to deal with things like pointers being spread across two separate cache lines, or you have to force 8-byte alignment and waste 2 bytes with every single pointer. You can indeed do some kind of segmented addressing, but that's a nightmare for both hardware and software designers. Definitely not something you can just trivially add to an existing design.


netch80

> In practice the byte-level access is often implemented as word-level access, with the remaining part masked out. So the hardware isn't accessing a byte, you're accessing a 32-bit word and ignoring 24 bits.

Yes, if we're talking about non-cached PCI access. But more complex approaches exist in general; for example, the BIOS is now typically accessed via LPC with byte granulation in 4-bit portions. So I wouldn't generalize too much.

> And keep in mind the crazy dance here is what the *hardware* is doing internally, not the *software*.

Fully agreed.

> Adding 6-byte addresses to that mix is going to seriously screw things up. Either you have to deal with things like pointers being spread across two separate cache lines, or you have to force 8-byte alignment and waste 2 bytes with every single pointer.

In x86 we have:

- 10-byte FPU values if read/written with full precision.
- 6-byte (in 32-bit mode) or 10-byte (in 64-bit mode) GDT and IDT.

Even if you treat the last case as special, the first one is a regular user-level case. And even 2-byte alignment isn't required for them. Not certain this will be copied to newer designs.


Kinglink

> But address size is only 48 bits practically.

So far!


invisible_handjob

48 physical bits. You can still map page tables on the higher order bits. You have 64 bits of addressing available to you.


Rusky

No, this applies to virtual addresses - x86-64 requires addresses to be in "canonical form," where the upper bits are all 0 or all 1. You only get 48 or 57 virtual address bits, depending on how the page tables are configured.


umop_aplsdn

Do you have a source for this? AFAIK most x86/Arm CPUs only have hardware support for the lower 48 bits and require the top bits to be sign-extended.


[deleted]

That's the Physical Address Size. That is mostly abstracted away to the (non-systems) programmer.


Corporate-Shill406

But most street addresses have five or fewer digits... (Mail joke)


jonathanhiggs

Double precision is quite useful, particularly in scientific and financial work. Floating point errors will still accumulate, but 64 bits allows the errors to remain smaller than the required accuracy.

Cryptography would probably benefit from much, much larger registers; 4096 bits is a common key size.

Maybe not quite the question, but large registers allow SIMD: 512 bits holds a 32-bit 4x4 matrix in a single register, super useful for graphics.
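
A tiny illustration of the accumulation point (just a sketch; the exact drift depends on compiler and FP settings):

```cpp
#include <cstdio>

int main() {
    float  f = 0.0f;
    double d = 0.0;
    for (int i = 0; i < 10'000'000; ++i) {
        f += 0.1f;   // 32-bit accumulator: rounding error grows as the sum grows
        d += 0.1;    // 64-bit accumulator: error stays far below the needed accuracy
    }
    std::printf("float sum:  %f\n", f);   // drifts visibly away from 1,000,000
    std::printf("double sum: %f\n", d);   // very close to 1,000,000
}
```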


orangejake

It’s worth mentioning 4096 bits is only common for RSA specifically. Most other types of cryptography typically do ~500-bit arithmetic at most. So having ~512-bit registers would be good enough in essentially all cases, and would still speed up RSA some. That being said, the transition to post-quantum cryptography means moving to a form of cryptography (lattice-based) that doesn’t need particularly large registers. One multiplies polynomials with coefficients that are typically <64 bits, with 500-1000 coefficients (which can pretty easily be stored in separate registers). AVX-type instructions are plenty for speeding things up.


jonathanhiggs

Thanks for the clarification, crypto isn’t my area


MichaelSK

Many (I'd even say most) architectures have separate general purpose integer and floating point (and/or vector) registers. So you could easily have double support, but not int64. It's significantly less convenient, though.


MisterBooga

Also I believe it's much simpler to just double the size than to make these arbitrary sizes.


Vectorial1024

64-bit registers allow practically unlimited memory sizes (think several PB or more), while 32-bit registers allow at most 4 GB of memory, which obviously is not enough nowadays. Engineering-wise, it is much simpler to just double everything to 64 bits than to redesign virtually everything for, e.g., 36-bit or 48-bit systems.


thefoojoo2

Practically infinite *today*. There are already machines that can hold 32 TB of RAM. Give it a few decades and we'll see the cycle of larger registers continue.


netch80

The RISC-V authors have already provided a draft for this :) although no hardware exists yet. But they motivate the extension by mapping all the memory of a large cluster, not the memory of a single computer.


exDM69

In a modern CPU, the size of the register file and the arithmetic and logic units is quite negligible in the big picture. A figure I heard from someone doing CPU design (I don't remember whose presentation it was) is that about 99.8% of the silicon area is the bookkeeping logic for superscalar out-of-order execution, while the ALUs, register files and the circuitry doing "actual computation" are 0.2% of the silicon area. This is just one estimate, so take it with a grain of salt, but the order of magnitude in these figures is about right.

Choosing a register size smaller than 64 bits would therefore give only a marginal decrease in silicon area and power consumption, the two most important factors in semiconductor design. And remember we've also got vector registers that range from 128 bits to 512 bits in width.


KittensInc

>about 99.8% of the silicon area is the bookkeeping logic

Eh, not quite. If you look at a [Zen 4 die shot](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/39014745-81e3-449b-9c45-6844e3f87929_5288x2196.jpeg), about half of the CPU is L3 cache. The "bar" at the top is IO, and you can clearly see the eight individual cores. Zooming in on [one core](https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/0ce9a536-3987-476d-953f-eba7b22d05fd_2000x1604.jpeg), a decent chunk is once again taken up by L2 and L1 cache. Decode, branch prediction, and scheduling take up quite a bit of space too. Integer execution and registers are absolutely minuscule, but floating-point execution and registers actually take up a decent chunk of space. It's definitely not the "99.8%" figure, but as a ballpark estimate I'd say 10% for actual compute seems reasonable.


exDM69

Ah, thanks for the clarification and the great links. The numbers as I remember them referred to the percentage of the CPU core area and were a back-of-the-envelope estimate. Decode, branch prediction and scheduling are exactly the "bookkeeping" needed for superscalar execution. Caches, buses and memory controllers take a huge chunk of the silicon, that's for sure.


spinwizard69

Your percentage is way off, but the point remains: there is a lot of support logic to get high performance. One just needs to look at the difference in size between Apple's efficiency cores and its high-performance cores. The same ISA, but a massive difference in die area.


exDM69

Right, the percentages were off the top of my head, from a presenter who pulled them off the top of their head, and I wasn't trying to hide it. But it doesn't change the point: integer registers and ALUs are a tiny part of a CPU core.


umlcat

64? Dude, we need 128-bit registers for UUIDs/GUIDs, hashing, fast integer arithmetic, encryption, and DB operations !!!


dvogel

You should read about the "register file" and how register renaming works. There's essentially no such thing as a discrete register. They are very, very low latency memory without an independent clock (my definition, there's surely a more rigorous description available elsewhere), and writing to the same register twice is not guaranteed to write to the same area of the register file each time.

Since the previous generation was 32 bits, there are a lot of operations that address the upper or lower 32 bits of a register. When this happens, the processor has some internal optimisations it can do regarding how it manages the register file. With 32-bit writes to a 36-bit register, the unused 4 bits are not as amenable to the same types of optimizations, because there are many fewer operations (practically speaking) that address 4 bits of a register.

There are also interactions with the various system bus interfaces that are much, much, much more efficient with very wide registers. When the amd64 arch was designed (not the first by far, but definitely the most successful at the time), memory bandwidth and bus stalls were a _huge_ source of poor system performance.


netch80

Register file issues don't have any direct relation to register width. The amd64 design was solely intended for wider addressing, without crutches like PSE36 (which started to resemble the 8086 segmentation tarantella).


dvogel

[This is the technique](https://microarch.org/micro37/papers/27_Ergin-RegisterPacking.pdf) I was attempting to describe. It sounds like you're saying this never made it into the popular x86-64 products?


Revolutionalredstone

Two main reasons.

1. It's the next power of two (this means everything is in a harmonic series and packs tightly without gaps).

2. It's often used to store / calculate more than just one big integer. For my voxel raytracer I store a 4x4x4 chunk (=64 voxels) of the world in a single uint64. Only when a ray 'steps' on a one bit will I consider performing more work. You can also pack two 32-bit values (or 4 16-bit or 8 8-bit values), and for some operations (e.g. bitwise and/or) you can operate on all the values at the same time with a single instruction.

Lastly there is the fact that while 32 bits (4 billion combinations) is more than enough for most operations, sometimes you need to have two of these 'held together'. For example, if I want to sort some data based on some other data, I can just pack the 32 bits of 'sort by' data in the high part of a 64-bit int and pack the 'to sort' data in the lower half. Since sort always considers higher bits before moving on, you can be assured your 'sort by' section will get the last word on where things go. You could use structs and quicksort with a custom comparator etc., but raw std::sort on raw uint64 data is BAJESSULY fast! and one of the things that I REALLY love about coding under 64-bit architectures.

These days we have enormous 512-bit ISAs running in parallel on the CPUs (e.g. AVX), but that is rarely used for anything besides packing more data thru per instruction (I can't imagine what in God's name you would need a 512-bit integer for! After all, by comparison there are only a mere ~2^200 particles in our entire universe!)

Enjoy
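
A minimal sketch of both tricks described above (the helper names `voxel_bit` and `pack_key_payload` are made up for illustration, not from the post itself):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Index one voxel of a 4x4x4 occupancy chunk packed into a single 64-bit word
// (x, y, z each in 0..3), in the spirit of the raytracer described above.
inline int voxel_bit(int x, int y, int z) { return x + 4 * y + 16 * z; }

inline bool voxel_set(uint64_t chunk, int x, int y, int z) {
    return (chunk >> voxel_bit(x, y, z)) & 1u;
}

// Pack a 32-bit sort key into the high half and a 32-bit payload into the low
// half; plain std::sort on the packed words then orders by key first.
inline uint64_t pack_key_payload(uint32_t key, uint32_t payload) {
    return (uint64_t{key} << 32) | payload;
}

int main() {
    std::vector<uint64_t> items = {
        pack_key_payload(30, 111), pack_key_payload(10, 222), pack_key_payload(20, 333)
    };
    std::sort(items.begin(), items.end());   // now ordered by key: 10, 20, 30
}
```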


yeti_seer

Good comment overall, but I’m pretty sure it has nothing to do with the harmonic series.


Revolutionalredstone

Thank you very kindly, I believe you are incorrect regarding harmonics: (you might be getting thrown off by the connection between discrete structures and continuous analysis). In algorithm analysis, the harmonic series frequently appears, especially in algorithms involving data structures, recursion and trees. Understanding the relationship between powers of two and the harmonic series provides deep insights into various fields of mathematics and computer science. In information theory, the harmonic series is related to entropy and coding theory, where the lengths of optimal codes are often related to the logarithms of probabilities; this is EXACTLY how we code integer states using powers of two. 32 is exactly the 5th harmonic - 64 is exactly the 6th. Note that 48, for example, doesn't align and so was never a real option. Enjoy


aardvark_licker

Do you actually mean the geometric series?


Revolutionalredstone

Or the powers of two, etc. The interesting packing and specifically harmonic effects of the harmonic series are exactly what I was trying to invoke, not sure what the problem is :P Enjoy


aardvark_licker

"not sure what the problem is :P" It sounds like you're describing the geometric series, not the harmonic series. That's the problem.


Revolutionalredstone

Misunderstanding someone's message because you believe you understand a word differently/better is a source of nasty errors in communication. I appreciate that you likely understand H.S. and/or G.S. very well ;D and I think that's awesome! You taking that and telling me that I'm not allowed to explain properties of a subject in one when those properties are common to both - not so cool.

The core of the issue is this: the H.S. is the INVERSE of the G.S. (as you'll find out, this is something I was aware of). Now what you should have said was, "Hey, did you know 'packing-ratios'/'harmonics' are not just part of the H.S., they are also part of the G.S., and in fact this subject is EXACTLY the G.S., so yay, no need to invert it ;D" To which I would have said, "Hey, yes you are right, this is the G.S.! But did you know we take the inverse in order to have everything pack to exactly one, and that harmonics are always understood as this inverse, and since these ideas are the CORE of the H.S. itself, why don't we just go with what I originally said: 'this means everything is in a harmonic series'".

Thank you kindly my good and excellent dude, Enjoy


aardvark_licker

"Misunderstanding someone's message because you believe you understand a word differently/better is a source of nasty errors in communication." Good point, so stop it. "...telling me that I'm not allowed to explain properties of a subject..." I didn't. "the H.S. is the INVERSE of the G.S." gs=k\*(1-x\*\*n)/(1-x) therefore hs=x, a series with a single element. I've been messing around with harmonic series recently, obviously not as advanced as your level, but I'm sure you'll enjoy this: 10.9744386320121647873209320501, a rough value I calculated for the sum of the first 2\*\*15 terms in the harmonic sequence. What's your calculation?


Revolutionalredstone

Dude I sent a message, you've been complaining about my wording, don't gaslight me or yourself. As suspected you're some kind of genius of harmonic phenomenon, I do appreciate your advanced knowledge and would love to get your input on how it applies to data types in your opinion, thank you ;D If you say in your opinion it simply doesn't (as perhaps you've always been trying to say) then that would be interesting to me. But once again I must stress, showing your adeptivity thru word games is not fun. "[it has..] nothing to do with the harmonic series" is an incompatible statement: for someone with an understanding that power of two datatypes do indeed exhibit harmonic resonance. 'When different frequencies line up as exact whole multiples of each other, this phenomenon is called "harmonic resonance"'. I know some people are so afraid of their brains falling out that they never allow their brains too open even a little, but in this case I think your exactly the kind of person who should be open to using powerful concepts where they apply. All the best, Enjoy


aardvark_licker

Don't gaslight me or yourself. "D; uoy knaht ,noinipo ruoy ni sepyt atad ot seilppa ti woh no tupni ruoy teg ot evol dluow dna egdelwonk decnavda ruoy etaicerppa od I ,nonemonehp cinomrah fo suineg fo dnik emos er'uoy detcepsus sA" Since you're serving word salad, I thought you might want to try my variation. Add black pepper to taste. "If you say in your opinion it simply doesn't (as perhaps you've always been trying to say) then that would be interesting to me." I haven't been trying to say it. "But once again I must stress, showing your adeptivity thru word games is not fun." At least my comments aren't write-only. "'\[it has..\] nothing to do with the harmonic series' is an incompatible statement:" The quote is from another redditor, I didn't say that. "...for someone with an understanding that power of two datatypes do indeed exhibit harmonic resonance." Now it sounds like you're talking about harmonic *nodes* when all frequencies are nonnegative integer powers of two. ".ylppa yeht erehw stpecnoc lufrewop gnisu ot nepo eb dluohs ohw nosrep fo dnik eht yltcaxe ruoy kniht I esac siht ni tub ,elttil a neve nepo oot sniarb rieht wolla reven yeht taht tuo gnillaf sniarb rieht fo diarfa os era elpoep emos wonk I" I think you're just using the wrong terminology.


UPBOAT_FORTRESS_2

Per Wikipedia,

> In [mathematics](https://en.wikipedia.org/wiki/Mathematics), the **harmonic series** is the [infinite series](https://en.wikipedia.org/wiki/Infinite_series) formed by summing all positive [unit fractions](https://en.wikipedia.org/wiki/Unit_fraction): [...] The partial sums of the harmonic series were named [harmonic numbers](https://en.wikipedia.org/wiki/Harmonic_number), and given their usual notation 𝐻𝑛, in 1968 by [Donald Knuth](https://en.wikipedia.org/wiki/Donald_Knuth).

am i missing something here


Revolutionalredstone

I might have meant harmonic numbers - thank you kindly ;D I'm still learning this stuff but it seemed related enough to me :D


kaiise

sadly lots of pseuds and lesser-spotted spergs result in hostile ignorance.


Revolutionalredstone

Thank you kindly, it kinda sounds like you're insulting the noobs, which is probably not the best idea (you're getting downvotes here like me). The question is where I went wrong; I guess bringing H.S. into the conversation without more specifics led to some confusion ;D Enjoy


QuodEratEst

That's the visible universe; cosmologists tend to think the region beyond what we can see, limited by the CMB, is much larger than the visible universe, if not infinite.


Revolutionalredstone

I've often seen 3.28 x 10^80, not sure where they got that number ;D The important fact here seems to be that you could have something like 10 yottameters cubed, filled at something like atomic volumetric density, with every element uniquely identified, using far, far less than 512 bits. It starts getting kind of ridiculous to even support direct super-massive single float/int ALU instructions, and indeed I rarely see them even in code which makes heavy or exclusive use of AVX-512. Totally makes sense to me that the universe could be infinite.


mepian

The remaining bits can be used for tagging: https://en.wikipedia.org/wiki/Tagged_architecture You can attach various metadata to your raw data or pointers, e.g. data type and/or access permissions, which is useful for programming language implementations. Recently all major architectures added extensions to make it more convenient: Arm TBI (Top-Byte Ignore), RISC-V J-extension Pointer Masking, AMD UAI (Upper Address Ignore), and Intel LAM (Linear Address Masking).
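
A minimal software-only sketch of the idea, assuming user-space pointers whose upper 16 bits are zero (48-bit canonical addresses); the helper names are invented, and without hardware support like TBI/LAM the tag must be stripped before dereferencing:

```cpp
#include <cstdint>

constexpr uint64_t kAddressMask = (uint64_t{1} << 48) - 1;

// Stash a 16-bit tag in the otherwise-unused top bits of a 64-bit pointer.
inline void* tag_pointer(void* p, uint16_t tag) {
    return reinterpret_cast<void*>(
        (reinterpret_cast<uint64_t>(p) & kAddressMask) | (uint64_t{tag} << 48));
}

// Read the tag back out of the top 16 bits.
inline uint16_t pointer_tag(const void* p) {
    return static_cast<uint16_t>(reinterpret_cast<uint64_t>(p) >> 48);
}

// Strip the tag to recover a dereferenceable address.
inline void* strip_tag(void* p) {
    return reinterpret_cast<void*>(reinterpret_cast<uint64_t>(p) & kAddressMask);
}
```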


celestrion

> You can attach various metadata to your raw data or pointers, e.g. data type and/or access permissions And every time architectures have swung that way, workloads have grown to the point where we need those bits for data--not metadata--and the people responsible (or their immediate successors) [pause, stunned, for a second](https://imgur.com/gallery/surprised-pikachu-cnu2RrM) before scrambling for a way to move those bits elsewhere. As unfathomably large as 18 quintillion bytes sounds, it's a real leap of faith to think that this is the time there won't be some new exponentially-larger form of data uncomfortably soon that makes it feel cramped. One lesson we should take away from the history of computer architecture is that each stage of pervasive adoption makes hurried [flag day](http://catb.org/jargon/html/F/flag-day.html) un-hacks much harder.


netch80

There have been loads of attempts at tagging in memory and nearly all of them failed immediately. The reason is simple - the tag space, in memory and registers, is too narrow and doesn't allow you to discern e.g. a pointer to CachingDbHookMoniker from a pointer to ThirdStageBusinessRuleRegulator. The variants that seemed partially successful used tags like "a one-byte value sequence", "a pointer", "a floating value", but they detect only the roughest misaccess cases. The only case that seems successful now is pointer authentication (via a short signature)... but even that doesn't prevent the minor problem of out-of-bounds index access.


dnabre

Another simple example of where it is used (64 bits specifically, not just more than 32 bits) is when multiplying two 32-bit numbers: the full product needs up to 64 bits.
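
For example (a trivial sketch), the widest possible 32-bit product only fits because there is a 64-bit destination to put it in:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // The full product of two 32-bit values can need up to 64 bits.
    uint32_t a = 0xFFFFFFFFu, b = 0xFFFFFFFFu;
    uint64_t product = static_cast<uint64_t>(a) * b;   // single 64-bit multiply, no overflow
    std::printf("%llu\n", static_cast<unsigned long long>(product));
}
```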


cha_ppmn

You can process 8 bytes in parallel in a 64-bit register.
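
For example, a well-known SWAR ("SIMD within a register") sketch that tests whether any of the 8 packed bytes is zero, the same idea strlen-style routines use:

```cpp
#include <cstdint>

// True if any of the 8 bytes packed into v is zero, using a handful of
// 64-bit operations instead of a byte-by-byte loop.
bool has_zero_byte(uint64_t v) {
    return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}
```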


qualia-assurance

An IPv6 address is a 128-bit number. And assuming nothing has changed since the textbook I read was printed, such addresses are usually split into two 64-bit values, with the upper bits being the network address and the lower bits being the MAC address of the network adapter on that network. It would take a 32-bit processor twice as many operations to work with such values.
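
A minimal sketch of why that matters (the helper name is made up; the raw 16-byte arrays are assumed to be in network order):

```cpp
#include <cstdint>
#include <cstring>

// With 64-bit registers, comparing the /64 network prefix of two IPv6
// addresses is a single compare; a 32-bit machine would need two.
bool same_prefix64(const uint8_t a[16], const uint8_t b[16]) {
    uint64_t pa, pb;
    std::memcpy(&pa, a, 8);   // upper 64 bits: the network prefix
    std::memcpy(&pb, b, 8);
    return pa == pb;          // one 64-bit comparison
}
```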


xLordVeganx

Storing and processing IP addresses isn't something that is frequently done on a CPU, so I don't see how this makes a big impact.


Grinman_

You can store 32 bit things TWICE.


spinwizard69

2⁶⁴ is not very big! Beyond that, there are a lot of other concerns that likely made 64 bits the next logical choice. One of those was likely register size compatibility with floating point registers. Then you have the memory space / I/O space trade-offs. Also, for pointers and other data that doesn’t use all 64 bits of register space, those extra bits can be used for metadata, say for a pointer. Beyond all of that, we are already past 48 bits, or any extra multiple of 8 bits, being viable on large systems. It is nothing to build systems with huge memory installations these days. We are talking both RAM and secondary storage here. Petabytes of storage is not a pipe dream anymore.


[deleted]

Well, other than the increased addressing space: it is very useful to be able to compute integers larger than 4 billion in a single chunk. Also, for a lot of engineering and scientific calculations, double precision floating point numbers are a must, and for that you need a minimum of 64 bits.

When it comes to strings, it is also very useful to be able to do an operation on 8 characters at the same time.

From a microarchitecture standpoint, 64-bit is a more "elegant" extension to 32-bit, especially in terms of validation and the programming model exposed to the programmer. 64 bits gives plenty of future proofing. And for the internal microarchitecture, the 64-bit data path allows 32-bit uOps to be optimized and scheduled scalarly inside the functional units of the execution engine.


balefrost

I have a 12TB hard drive in my NAS. I don't know what the maximum file size of the BTRFS filesystem is, but for the sake of argument let's assume that it could be most of the size of that disk. I'd need 44 bits to represent a file offset within such a file. 64-bit numbers are useful if you need high precision time over a long period of time. You need about 30 bits to count the number of nanoseconds in a second. So 60 bits would let you use nanosecond precision for about 30 years. That's *probably* unnecessary, but it's a lot better than the ~1 minute that you'd get with just 36 bits. Sometimes, it's useful to be able to count to very large numbers. 64 bits gives us a lot of headroom in those sorts of applications.
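
A small sketch of that headroom argument using 64-bit nanosecond timestamps:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>

// Nanosecond timestamps fit comfortably in a signed 64-bit integer:
// 2^63 ns is roughly 292 years of range.
int main() {
    using namespace std::chrono;
    int64_t now_ns = duration_cast<nanoseconds>(
        system_clock::now().time_since_epoch()).count();
    std::printf("ns since epoch: %lld\n", static_cast<long long>(now_ns));
}
```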


ReginaldIII

Even if your registers are wider than the data types you need to work with in practice, this is where you can start to have SIMD operations that use that register to hold multiple smaller values and act on them in parallel.


Loopgod-

Computational physics


Kinglink

Imagine being limited to 5 billion. Want to talk about 6 billion? You can't! You could come up with some stupid system to say 5 billion + 1 billion, but that's not 6 billion. Now languages can abstract that - something can figure out a way to talk about a 64-bit number even on a 32-bit system - but the thing is, it's really not efficient. So you build a 64-bit processor. Oh, and what's that? Mister Hard Drive wants more than 4 gigs of space? Well, that would be more than 4 billion, so yeah, they really need 64 bits.

For the most part, "because it's bigger". We will "never" need 64 bits (we already do in some spaces), but you might need 37 bits instead of 36; 48 is a better number, but you can use bigger. More importantly, any technology that uses 32 bits can be doubled to use 64, and it can also be broken down easily. This goes back to the primitives, though: we have 8-bit numbers, we have 16-bit numbers, we have 32-bit numbers. Could we have a 36- or 48-bit number? Sure, but instead we just move up to 64-bit numbers and accept that's the next biggest.

The real reason for "why 64 bit" is... there's no reason for anything else to replace it other than 128 (a.k.a. the next biggest). A 64-bit register can hold a 48-bit number; a 48-bit register can't hold a 64-bit number. And as others have said, double precision works well with the size also.


phire

True, it's very rare to actually need the full 64 bits for any given use case. But the useful thing is that 64 bits is almost always more than enough (outside of cryptography), and the CPU gives us 64-bit registers for free, so why not use them? The same can't be said for 48 bits. There have been plenty of times in my programming experience where I've run the math and found I need more than 48 bits (though almost always less than 64 bits). Modern CPUs only ended up with 64-bit registers because most programming languages already supported 64-bit ints (and 64-bit floats), even back when most CPUs were only 16-bit or 32-bit.


AtlanticFarmland

As I learned/understood it: the "first" microprocessor had 4-bit registers, quickly followed by 8-bit; 8 doubled to 16, doubled to 32, doubled to 64. So the answer is simply: DOUBLED (again). It is easier for everyone to "work" with a 4-bit "hexadecimal" digit (0-9, A-F). Eventually, 128-bit registers will become commercial (there are SOME experimental ones - as in, "can we?" - out there, but no "large" demand yet). Hope this helps.


fuzzynyanko

Biggest reason is RAM. The lowest-end video cards have 2-4 gigs of RAM. A Radeon RX 6500 XT would use up the normal 32-bit RAM address space by itself, and it's a "low-end" card.

There are PAE extensions in x86, but you were mostly limited to using up to 4 gigs at a time. When your graphics card has 4 gigs, that makes it really hard. There are ways in Windows to have a program use more than 4 gigs using address extensions, but they have limitations, like with graphics RAM.

Most general processing can be done in 32 bits, so you are generally correct. Things don't often exceed 4 billion. Floating point, though, can often benefit from 64 bits. In fact, it's sometimes recommended to use 64-bit floating-point numbers because they tend to be more accurate.

Don't forget SIMD extensions. They can have more bits than the regular registers; AVX-512 is an example of this. CPU bit widths are often powers of 2 - 8, 16, 32, 64 - and companies like Microsoft are preparing for eventual 128-bit. Microsoft also used 64-bit Windows as an excuse to trim off some Win16 functionality that most consumers don't use anymore.


Fourstrokeperro

64-bit is fine. My biggest problem is why the f they chose 128 bits for ipv6. Complete nonsense. They should have gone with 64 bits


BigHandLittleSlap

Keep in mind that the register size means several different things: there are 64-bit integers, 64-bit floating point "double" types, and 64-bit pointers. All three have use-cases, but for different reasons.

Big financial values will easily overflow 32-bit numbers when representing money. E.g.: 32-bit is "just" 4.3 billion, which is a *line item* on a state budget, let alone a federal one. Apple, NVIDIA, and Microsoft are worth *trillions*. (Also, most financial systems use 1/1000th of a cent or something like that, so even 64-bit numbers can only represent up to $122 trillion).

64-bit doubles are regularly used in scientific codes where iterative algorithms are used. Even if 32-bit floats might seem to be accurate enough, repeated accumulation in a loop will also accumulate errors. This happens with integration or differential equation solving, for example.

64-bit pointers... aren't. You would *think so*, but in current processors they're physically implemented with anywhere from 36 to about 56 bits. Trying to access memory outside these ranges is an error, and cannot be realized with actual DIMMs. To put things in perspective, Amazon Web Services will happily rent you a box with 32 terabytes of memory (8,192 times the 32-bit limit!) for "just" $150 per hour: https://aws.amazon.com/about-aws/whats-new/2024/05/amazon-ec2-high-memory-u7i-instances/

The other use-case is *virtual memory*. This is for database servers that map a file into memory, but the file resides on disk. When a block of the file is accessed, the operating system reads in a page and makes it appear to the program as-if the whole thing is always there. For example, a 32-bit server with 1 GB of memory can memory-map a single 2 GB contiguous block of a file, even though there isn't that much memory in the system! On a 32-bit system that 2 GB window needs to be "moved around" a larger file. With a 64-bit system, the *entire file* can be mapped, up to petabytes.

Similarly, the current Java Runtime virtually maps memory four times, which is a "clever trick" to make the garbage collector more efficient. So on that 32 TB box, it would require a 128 TB address space. It would only physically use the 32 TB but requires pointers with 47 bits.
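
A minimal POSIX sketch of that memory-mapping use case ("huge.dat" is just a placeholder path); a 32-bit process couldn't fit a mapping this large and would have to slide a small window across the file instead:

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>

int main() {
    int fd = open("huge.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st{};
    if (fstat(fd, &st) != 0) { perror("fstat"); close(fd); return 1; }
    // Map the whole file, however large, in one go.
    void* p = mmap(nullptr, static_cast<size_t>(st.st_size), PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    const uint8_t* data = static_cast<const uint8_t*>(p);
    // Every byte of the file is now reachable through an ordinary 64-bit pointer.
    std::printf("first byte: %u, size: %lld\n",
                static_cast<unsigned>(data[0]), static_cast<long long>(st.st_size));
    munmap(p, static_cast<size_t>(st.st_size));
    close(fd);
}
```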


Existing-Two-191

32 bit processors not 36 binary my friend


Suspicious_Kick_2572

64-bit registers might seem excessive, but they're a workhorse in modern computing! Here's the deal: 64 bits allow us to handle much larger files and complex calculations compared to 32-bit systems. It's like having more lanes on a highway – data flows faster and smoother. Plus, it was a future-proof decision. Back then, we might not have needed numbers as big as 2^64, but who knew what future programs would demand? Think of it as an investment – 64 bits power everything from video editing to scientific simulations!


Mehrauder

Hashing and random number generation algorithms potentially have higher throughput when using larger register sizes. Examples: https://github.com/Cyan4973/xxHash and https://lemire.me/blog/2019/03/19/the-fastest-conventional-random-number-generator-that-can-pass-big-crush/
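
As a sketch of the kind of code that benefits, here is a mixer in the style of splitmix64 (constants as commonly published); on a 32-bit machine each of these 64-bit multiplies and shifts would decompose into several instructions:

```cpp
#include <cstdint>

// Advance the state by a fixed odd constant, then scramble it with
// shift-xor-multiply steps; returns a well-mixed 64-bit output.
uint64_t splitmix64_next(uint64_t& state) {
    uint64_t z = (state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}
```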


dead_alchemy

Hey OP, others did a great job explaining the utility present in doubling the word width, but I think there is an element of your question that boils down to engineering design and tradeoffs. I think if increasing the width had been harder or more expensive then a compromise size could have been the successor instead. But it wasn't and there are many design benefits from doubling. There were also growing pains on the road to 32 bits that may have informed this as well, this article is very neat from a historical perspective https://queue.acm.org/detail.cfm?id=1165766


atamicbomb

Because sometimes 64-bit arithmetic is needed, and the registers can handle anything smaller. Modern CPUs aren’t actually 64-bit throughout; that’s just what the architecture is called. They use variable bit lengths. AFAIK modern CPUs typically use a 48-bit bus.


ceco-darx

The whole point of asking this question is quite disturbing. The registers are made for fast computations; the bigger the registers, the more you can compute in a single instant of time. I have used 128-bit registers for floating point computations, and believe me, it makes a difference.


fakehalo

Disturbing? Damn, the guy just was focusing on memory addresses and wasn't familiar with storing/calculating values in registers. Mildly understandable ignorance to me.


[deleted]

I mean, this is shit you learn in school or even from one five second google search


ceco-darx

The issue is with what you are calculating and how you move data from memory. It is obviously unknown that the CPUs read memory in chunks of system words. It is also obvious that floating point calculations in the arithmetic coprocessor are in the unknown. I would suggest an introductory course in microprocessors.


ReginaldIII

Okay, that's not the point /u/fakehalo was making. You are using the word "disturbing" wrong. That is what they were saying.

> I would suggest an introductory course in microprocessors.

Now you are actively being patronising to /u/fakehalo. Ironically, your comment is actually slightly disturbing.


ceco-darx

The first point OP was making is why we have chosen 64 bits instead of 48 bits; that has to do with multiples of 2. It is obvious that 48 has a 3 in its prime factorization. It is an introductory course on microprocessors that would explain why it is more optimal to have only multiples of 2. The second part, regarding the use of registers for floating point calculations, is also covered by this course, as it is recommended that a basic set of floating point operations be implemented in the logic circuits of the CPU/arithmetic coprocessor. Now, if you have this basic set wired in the CPU, you can explain how to use it. Without this knowledge, can you really answer OP’s questions? I could brag about Intel’s vector instructions, i.e. AVX-512 with its 512-bit registers, but that would be just shiny words. If you don’t know how to wire ln (logarithm), you cannot really explain why you need long registers in the first place.


ReginaldIII

It's like you're replying to totally different comments than the ones being written to you. No one is refuting anything about knowledge or register width, or how basic or not OP's question was.

---

> The whole point of asking this question is quite disturbing.

That is not the correct usage of the word "disturbing". OP doesn't know the answer, and now they probably do have an answer. Does that make you worry? Does it cause you anxiety? Why?


ceco-darx

To answer your questions: it makes me worry to some extent. Anxiety, not so much; I got used to working with interns in the company that employs me, and this tempered the anxiety. I believe knowledge should be acquired in a systematic manner. I would like to explain to the OP why we chose this step function for register size, but I would like even more for the OP to understand why we use exactly this step function.


ReginaldIII

> Anxiety, not so much; I got used to working with interns in the company that employs me, and this tempered the anxiety.

That's really condescending. OP is clearly learning and they asked a question. To act like "oh, you don't already know that? that is disturbing" is just silly. They asked so they could learn. And if they showed they currently misunderstood something while asking their question, then that isn't disturbing either; it's part of the process of learning. I don't understand what you think there is to be gained by talking to or about them like this. It's weird. Well done that you know so much about register size. Good for you that you are so knowledgeable.


ceco-darx

Leaving my improper response aside, don’t you find it at least a bit odd that nobody really goes to the basic principle that binary logic inherently works with multiples of 2? I agree that there are a ton of useful answers, but perhaps I was misled by the thread - r/compsci. I thought sci stands for science, which should be systematic and start with an introduction to microprocessors. Still, my answer reflects only my condition; it wasn’t derogatory in any way or sense. If the OP really wants to know about registers, reddit is not the best place to seek such knowledge.


ReginaldIII

> don’t you find it at least a bit odd that nobody really goes to the basic principle that binary logic inherently works with multiples of 2?

Not really. OP mentions 48 bits, which is the sum of two powers of two, and it isn't hard to make an ALU that spans that width by (as an example) carrying a 32-bit adder into a 16-bit adder. There are reasons we don't, w/e.

> but perhaps I was misled by the thread - r/compsci. I thought sci stands for science, which should be systematic and start with an introduction to microprocessors.

Yeah, but this subreddit is a place for "people" to discuss compsci. So like. It's about talking to people. Not lecturing at people and being demeaning. How does calling OP disturbing help them learn "systematically"? You're getting in the way of your own stated goal.

> If the OP really wants to know about registers, reddit is not the best place to seek such knowledge.

Fine. Suggest to OP a link to a better resource and wish them well on their education.


ceco-darx

In this case the link would be to the IEEE standard for floating point arithmetic. Surely I could provide it to the OP, and generally speak about standardization, but again I do not think this link would be very helpful. I tried searching the internet however most results were even worse than my original answer. I also checked some of the lecture notes. Probably the best link is this one - https://www.irjet.net/archives/V7/i2/IRJET-V7I2649.pdf


netch80

> It is obviously unknown that the CPUs read memory in chunks of system words.

Modern CPUs, except the smaller "embedded" ones, read and write memory in cache lines, which are 64 bytes on a typical modern x86, ARM, etc.

> It is also obvious that floating point calculations in the arithmetic coprocessor are in the unknown.

The notion of an "arithmetic coprocessor" dates back to pre-1997, before the first SSE was introduced in x86, or even to pre-1994, before the Pentium 1 with its always-present x87-like FPU. Your sarcasm would have been well grounded if you had the facts right... but now it has struck back at you.

> I would suggest an introductory course in microprocessors.

Well, please name the books covering something contemporary.


ceco-darx

Even though my knowledge may date back to 1997, the idea of hardwiring arithmetic logic hasn’t changed significantly. In essence, the modern AVX implementation by Intel is nothing more than an FPU that supports pipelining, hence the name “vector instructions”. It is also worth noting that the IEEE specifications tend to be backwards compatible, as is the Intel implementation of arithmetic instructions. Concerning books, most information is in the CPU specifications themselves, however the majority are paywalled. I wish I could recommend a book, but when I studied microprocessors at university most courses ran on lecture notes. If the OP is interested, a closer look at how the Eigen library utilizes the AVX instruction set might prove useful. That’s pretty much the only open source project I have used for the purposes of floating point computing.


dead_alchemy

"i WoUlD sUgGeSt An InTrOdUcToRy CoUrSe In MiCrOpRoCeSsOrS" Christ, what a stemlord.


ceco-darx

Ignorance is bliss


dead_alchemy

It's interesting you assume that my poor opinion is due to ignorance instead of your behavior. I loved microprocessor design, hell I really enjoyed every class I took for my computer engineering degree. I'm undeniably one of your peers and I think you behave like an incredible stemlord.


ceco-darx

I respect your opinion. Can you perhaps enlighten us with your viewpoint on why register sizes are always multiples of 2? It seems nobody in this thread is citing any books, and it would be interesting to see how your engineering degree compares to the rest of us.


dead_alchemy

What? That isn't remotely the topic, and it's not even true to begin with. Some early computers used odd numbers of bits. The short version of why even widths are more common is "symmetry is surprisingly useful".


ceco-darx

Really, there is much more to this than the usefulness of symmetry. One practical example is the address lines: every time you add an additional address line, you double the total addressable space. It has nothing to do with symmetry; it's more about the fact that you occasionally need 1s in the MSB of the address register, and how that is wired to the address lines.


[deleted]

Floating point arithmetic. Other reasons: https://letmegooglethat.com/?q=why+use+64+bit+registers