The transistor could be in a section you don't use, so you will never know. It could be in a section you use, but the failed condition could still appear to work: you only ever needed a zero there, it failed to a zero, and that is fine. You might not notice until, perhaps, a firmware upgrade uses that transistor differently. Or it could be in a place that you use, and the failure can result in any number of possible problems that may manifest themselves in many ways. It is not possible to expand on that here, certainly since we don't know what part you are using, and the number of possible outcomes is almost unmeasurable; a drop in the ocean doesn't begin to cover the number of hypotheticals.

For COTS systems, there is nothing whatsoever to protect you. You might have software that covers the dense or risky things like RAM, a POST memory test kind of thing. But if the failed transistor is, as you suggest, in the processor core, then you have a brick. Again, drop in the ocean: it is not possible to make a POST that covers the possibilities and reports on them properly, and it is not worth the effort of even talking about it. It is not worth talking about generally, even in aerospace.

Chip screen looks for and, with experience, covers more than, say, 99.99% of failures. When making the parts, things like RAM that are higher risk might have alternate blocks on the die that can be fused in using the BIST or other screen tests. Likewise, some products are derived from other products; to go back in the day, take the 80486SX vs the 80486DX. If the floating point unit failed during chip manufacturing test, you blow the fuse(s) to turn it from a potential DX into an SX and sell it that way. Depending on the product, one in some number of thousands or tens of thousands is expected to fail down the road. That is just how it is.

In aerospace, radiation-hard parts in particular, flip flops are (or were, when you still had folks around who knew how to build reliable stuff; they have mostly retired now) triple voted: three sets of transistors per bit, so if one fails or experiences a single event upset (which is the primary reason for the triple voting), the other two dominate the vote. You could argue that this gets you the what-if-one-fails case. But you don't triple vote everything. Combinational logic isn't; if it gets a hit, it should settle out before latching into the next flip flop/bit. You also need much beefier transistors to deal with single event upset, and you also have more other material to deal with total dose, to extend the life of the part before the whole thing dies from exposure. That is decades of experience (all lost to the young-uns who grew up in the disposable age: make space trash instead of make one that always works). A single transistor failing is not a primary concern; single event upset, latch-up (which could/would lead to destruction if not handled), total dose, etc. are. Single point of failure along with MTBF are very important: the MTBF should be longer than the mission life, and no single point of failure should be able to go without detection and resolution. Two failures at once are not normally solvable (drop in the ocean).

With COTS you don't worry about a single transistor failing; it is all about averages/statistics. You start with experience; there are not that many foundries anyway, and they know what they are doing. You follow the decades' worth of experience in design, layout, design verification, testing, etc. And you most definitely expect a yield less than 100%, and then, of the parts packaged and delivered, you expect some percentage to fail in the field.
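To make the triple voting of flip flops a bit more concrete, here is a minimal software sketch of a 2-of-3 majority vote. It is only a model of the idea, not how any particular rad-hard part implements it in silicon; the names `majority_vote` and `flip_bit` and the example value are made up for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of one copy."""
    return value ^ (1 << bit)


stored = 0b1011_0010                   # hypothetical register contents
copies = [stored, stored, stored]      # three redundant sets of storage
copies[1] = flip_bit(copies[1], 4)     # an upset hits one copy only
voted = majority_vote(*copies)         # the two good copies outvote it
```

With one copy upset, `voted` still equals `stored`. If the same bit is hit in two copies at once, the vote follows the two bad copies, which is the "two failures at once" case that the voting does not solve.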