
Cache memory



Composition and structure

The high-speed buffer memory (cache) is the first-level memory that sits between the main memory and the CPU. It is built from static memory chips (SRAM). Its capacity is relatively small, but its speed is much higher than that of the main memory and close to the speed of the CPU.

It is mainly composed of three parts:

Cache memory body: stores the instructions and data blocks transferred from the main memory.

Address conversion component: establishes a directory table to realize the conversion from main memory addresses to cache addresses.

Replacement component: when the cache is full, replaces data blocks according to a certain strategy and modifies the address translation component.
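To make this division of labor concrete, the sketch below (a minimal Python illustration with hypothetical sizes, not part of the original text) models the cache body as an array of lines, each carrying the valid bit and main-memory tag maintained by the address conversion component together with the data block itself.

    from dataclasses import dataclass, field

    @dataclass
    class CacheLine:
        valid: bool = False       # load (valid) bit maintained by the address converter
        tag: int = 0              # main-memory block address recorded for this line
        data: bytearray = field(default_factory=lambda: bytearray(64))  # the cached block

    # The cache memory body itself: e.g. 256 lines of 64 bytes = 16 KB (sizes are illustrative)
    cache_body = [CacheLine() for _ in range(256)]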

Working principle

Cache memory is usually composed of high-speed memory, associative memory, replacement logic circuits and the corresponding control circuits. In a computer system with a cache memory, the address used by the central processor to access the main memory is divided into three fields: row number, column number, and address within the group. The main memory is therefore logically divided into several rows; each row is divided into several memory cell groups; each group contains several or dozens of words. The high-speed memory is likewise divided into rows and columns of memory cell groups. Both have the same number of columns and the same group size, but the number of rows in the high-speed memory is much smaller than that in the main memory.

The associative memory is used for address association. It has the same number of rows and columns of storage units as the high-speed memory. When a memory cell group in a certain row and column of the main memory is transferred into an empty memory cell group in the same column of the high-speed memory, the corresponding unit of the associative memory records the row number that the transferred group had in the main memory.

When the central processing unit accesses the main memory, the hardware first automatically decodes the column number field of the access address and compares all the row numbers stored in that column of the associative memory with the row number field of the main memory address. If one of them matches, the main memory unit to be accessed is already in the high-speed memory, which is called a hit; the hardware maps the main memory address to the high-speed memory address and performs the access. If none of them matches, the unit is not in the high-speed memory, which is called a miss; the hardware then accesses the main memory and automatically transfers the memory cell group containing that unit into an empty group in the same column of the high-speed memory, while the row number of that group in the main memory is stored into the corresponding unit of the associative memory.
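A rough Python sketch of this lookup (field widths and sizes are assumptions, not taken from the text): the column field selects one column of the associative memory, and the row-number field is compared against every row number recorded in that column to decide hit or miss.

    ROWS_IN_CACHE = 4      # rows in the high-speed memory (far fewer than in main memory)
    COLUMNS = 16           # number of columns, identical in cache and main memory
    GROUP_SIZE = 64        # words per memory cell group

    # Associative memory: for each cache row and column, the main-memory row number held there
    assoc = [[None] * COLUMNS for _ in range(ROWS_IN_CACHE)]

    def split_address(addr):
        """Decode a main-memory address into (row number, column number, address in group)."""
        in_group = addr % GROUP_SIZE
        column = (addr // GROUP_SIZE) % COLUMNS
        row = addr // (GROUP_SIZE * COLUMNS)
        return row, column, in_group

    def lookup(addr):
        """Return the cache row on a hit, or None on a miss."""
        row, column, _ = split_address(addr)
        for cache_row in range(ROWS_IN_CACHE):
            if assoc[cache_row][column] == row:   # compare every row number stored in the column
                return cache_row                  # hit
        return None                               # miss: the group must be fetched from main memory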

When a miss occurs and there is no empty position in the corresponding column of the high-speed memory, a certain group in that column is evicted to make room for the newly transferred group; this is called replacement. The rules that decide which group to evict are called replacement algorithms. Commonly used replacement algorithms include the least recently used algorithm (LRU), first-in first-out (FIFO), the random method (RAND), and so on. The replacement logic circuit performs this function. In addition, when a write operation is performed on the main memory, hits and misses must be handled separately in order to keep the contents of the main memory and the high-speed memory consistent.

Storage hierarchy

The main-auxiliary storage hierarchy arose because the main memory capacity of a computer is always too small relative to the capacity the programmer needs. Originally, the transfer of programs and data from auxiliary storage to main memory had to be arranged by the programmer, who had to spend a great deal of effort and time dividing a large program into blocks in advance, determining where those blocks resided in auxiliary storage and at which addresses they would be loaded into main memory, and deciding how and when each block was called in and out while the program was running. Hence there was a problem of memory space allocation. The formation and development of operating systems freed programmers as far as possible from address positioning between main and auxiliary memory, and at the same time gave rise to the auxiliary hardware that supports these functions. Through the combination of software and hardware, main memory and auxiliary memory are unified into one whole, as shown in the figure. The main storage and auxiliary storage then form a storage hierarchy, that is, a storage system. On the whole, its speed is close to that of main storage, its capacity is close to that of auxiliary storage, and its average price per bit is also close to that of the cheap and slow auxiliary storage. The continuous development and improvement of this system has gradually produced the now widely used virtual storage system. In such a system, the application programmer can address the entire program uniformly with the machine instruction address code, as if the programmer had the whole virtual memory space corresponding to the width of the address code. This space can be much larger than the actual space of the main memory, so that the entire program can be held. This instruction address code is called the virtual address (virtual memory address) or logical address, and the corresponding storage capacity is called the virtual memory capacity or virtual memory space. The address of the actual main memory is called the physical address or real (storage) address, and the corresponding storage capacity is called the main storage capacity, real storage capacity or real (main) storage space.

Main-auxiliary storage hierarchy

Cache-main storage hierarchy

When the virtual address is used to access the main memory, the machine automatically converts it into the real address of the main memory through auxiliary software and hardware, and checks whether the content of the unit corresponding to this address has already been loaded into the main memory. If it is in the main memory, it is accessed; if it is not, the program and data containing it are first transferred from the auxiliary memory to the main memory by the auxiliary software and hardware, and then accessed. These operations do not have to be arranged by the programmer; in other words, they are transparent to the application programmer. The main-auxiliary storage level solves the contradiction between the demand for large memory capacity and the need for low cost. In terms of speed, the computer's main memory and CPU have always kept a gap of about an order of magnitude, and this gap obviously limits the potential of the CPU's speed. To bridge this gap, a single memory built with only one process technology is not feasible; further work must be done on the structure and organization of the computer system. Setting up a high-speed buffer memory (Cache) is an important way to solve the access speed problem. A cache memory is placed between the CPU and the main memory, forming a cache-main memory level, and the Cache is required to keep up with the CPU in terms of speed. The address mapping and scheduling between the Cache and the main memory borrow from the techniques of the main-auxiliary storage level that appeared earlier. The difference is that, because of the high speed requirements, they are realized entirely by hardware rather than by a combination of software and hardware, as the figure shows.

Address mapping and conversion

Address mapping refers to the correspondence between the address of a certain datum in the main memory and its address in the buffer. The following describes three address mapping methods.

1. Fully associative method

Address mapping rule: any block of the main memory can be mapped to any block of the Cache.

(1) Main memory and cache are divided into data blocks of the same size.

(2) A data block of the main memory can be loaded into any location of the cache. If the number of Cache blocks is Cb and the number of main memory blocks is Mb, then there are Cb×Mb possible mapping relationships.

The directory table is stored in an associative memory. It includes three parts: the block address of the data block in the main memory, the block address after it is stored in the cache, and the valid bit (also called the load bit). Since this is a fully associative method, the capacity of the directory table is the same as the number of blocks in the cache.

Advantages: the hit rate is relatively high, and the utilization of the Cache storage space is high.

Disadvantages: when accessing the associative memory, the entire contents must be compared every time; the speed is low and the cost is high, so it has few applications.
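For illustration only (hypothetical block counts, not the article's figures), a fully associative directory lookup can be sketched in Python: every valid entry must be compared with the main-memory block address.

    # Directory table: one entry per cache block, each holding
    # (valid bit, main-memory block address, cache block address).
    directory = [(False, 0, i) for i in range(8)]   # Cb = 8 cache blocks, chosen arbitrarily

    def fully_associative_lookup(main_block):
        """Compare the main-memory block address with every directory entry."""
        for valid, mem_block, cache_block in directory:
            if valid and mem_block == main_block:
                return cache_block   # hit: the block is already in the cache
        return None                  # miss: any free (or victim) cache block may receive it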

2. Direct mapping method

Address mapping rule: a block in the main memory can only be mapped to one specific block in the Cache.

(1) The main memory and cache are divided into data blocks of the same size.

(2) The main memory capacity should be an integer multiple of the cache capacity. The main memory space is divided into areas according to the cache capacity, and the number of blocks in each area of the main memory is equal to the total number of blocks in the cache.

(3) When a block of a certain area in the main memory is stored in the cache, it can only be stored in the cache location with the same block number.

Data blocks with the same block number in the different areas of the main memory can all be transferred into the cache location with that block number, but only one of them can be stored in the cache at any time. Since the main memory and cache block numbers are the same, only the area number of the transferred block needs to be recorded in the directory; the block number and in-block address fields of the main memory address and the cache address are exactly the same. The directory table is stored in a high-speed, small-capacity memory and includes two parts: the area number of the data block in the main memory and the valid bit. The capacity of the directory table is the same as the number of cache blocks.

Advantages: the address mapping method is simple. When accessing data, it is only necessary to check whether the area numbers are equal, so a faster access speed can be achieved, and the hardware is simple.

Disadvantages: replacement operations are frequent and the hit rate is relatively low.
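A minimal sketch of the direct mapping rule in Python (block counts are illustrative): the block number inside its area fixes the cache position, so only the area number has to be compared.

    CACHE_BLOCKS = 8                      # total number of blocks in the cache (hypothetical)

    area_of = [0] * CACHE_BLOCKS          # area number currently held in each cache block
    valid = [False] * CACHE_BLOCKS        # valid bit per cache block

    def direct_mapped_access(main_block):
        """main_block is the absolute block number in main memory."""
        area = main_block // CACHE_BLOCKS     # which cache-sized area of main memory
        index = main_block % CACHE_BLOCKS     # block number inside the area = cache position
        if valid[index] and area_of[index] == area:
            return index                      # hit: a single comparison was enough
        # miss: the block is loaded into its fixed position, replacing whatever was there
        area_of[index] = area
        valid[index] = True
        return None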

3. Group-associative mapping method

The group-associative mapping rules:

(1) Main memory and Cache are divided into blocks of the same size.

(2) Main memory and Cache are divided into groups of the same size.

(3) The main memory capacity is an integer multiple of the cache capacity. The main memory space is divided into areas according to the size of the buffer, and the number of groups in each area of the main memory is the same as the number of groups in the cache.

(4) When data in the main memory is loaded into the cache, the group numbers of the main memory and the cache must be equal; that is, a block in each area can only be stored in the space of the same group number in the cache, but it can be placed in any block position within that group. In other words, direct mapping is used from main memory groups to Cache groups, while fully associative mapping is used within the two corresponding groups.

The conversion between the main memory address and the cache address has two parts. The group address is obtained according to the direct mapping method, and the block address is accessed associatively by content. The group-associative address conversion unit is also implemented with associative memories.

Advantages: the probability of block collisions is relatively low, the utilization of the blocks is greatly improved, and the block miss rate is significantly reduced.

Disadvantages: the difficulty and cost of implementation are higher than those of direct mapping.
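The combination of the two rules can be sketched as follows (group and way counts are assumptions): the group number is obtained by direct mapping, while the remaining address bits are compared associatively against every block in that group.

    GROUPS = 4                # groups in the cache and in each area of main memory (hypothetical)
    BLOCKS_PER_GROUP = 2      # blocks (ways) inside each group

    # For each group and way: (valid bit, tag). The tag holds the address bits above the
    # group field and identifies which main-memory block currently occupies that way.
    tags = [[(False, None) for _ in range(BLOCKS_PER_GROUP)] for _ in range(GROUPS)]

    def group_associative_lookup(main_block):
        group = main_block % GROUPS       # direct mapping from main-memory group to cache group
        tag = main_block // GROUPS        # compared associatively within the group
        for way, (valid, stored_tag) in enumerate(tags[group]):
            if valid and stored_tag == tag:
                return group, way         # hit
        return None                       # miss: any way of this group may be replaced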

Replacement strategy

1. According to the law of program locality, a running program always tends to use the instructions and data that have been used recently. This provides a theoretical basis for the replacement strategy. Taking into account factors such as hit rate, difficulty of implementation and speed, replacement strategies include the random method, the first-in first-out method, the least recently used method, and so on.

(1) Random method (RAND method)

The random method randomly determines which memory block to replace. A random number generator is set up, and the block to be replaced is chosen according to the generated random number. This method is simple and easy to implement, but the hit rate is relatively low.

(2) First-in first-out method (FIFO method)

The first-in first-out method selects the block that was loaded earliest for replacement. A block that was loaded first but has been hit many times is nevertheless likely to be replaced first, which does not conform to the law of locality. The hit rate of this method is better than that of the random method, but still does not meet the requirements. The first-in first-out method is easy to implement.

(3) Least recently used method (LRU method)

The LRU method is based on the usage of each block: it always chooses the least recently used block for replacement. This method better reflects the law of program locality. There are many ways to implement the LRU strategy.
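One common way to realize LRU in software is to keep the blocks in recency order and always evict the oldest one; the Python sketch below (class and method names are illustrative, not from the article) shows this idea.

    from collections import OrderedDict

    class LRUCache:
        """Minimal LRU replacement: the least recently used block is evicted first."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.blocks = OrderedDict()            # keys kept in recency order, oldest first

        def access(self, block_id):
            if block_id in self.blocks:
                self.blocks.move_to_end(block_id)  # hit: mark as most recently used
                return "hit"
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)    # miss and cache full: evict least recently used
            self.blocks[block_id] = True           # load the new block
            return "miss"

    # Example trace: repeated accesses to a small working set mostly hit after the first pass.
    cache = LRUCache(capacity=3)
    results = [cache.access(b) for b in [1, 2, 3, 1, 2, 4, 1]]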

2. In a multi-bank parallel storage system, I/O device requests to the main memory have a higher priority than CPU fetches, which can force the CPU to wait for the I/O device's memory access, sometimes for several main memory cycles, and reduces the efficiency of the CPU. In order to avoid the CPU and the I/O devices competing for memory access, a level of cache can be added between the CPU and the main memory. In this way, the main memory can send the information the CPU needs to the cache in advance; while the main memory is exchanging data with an I/O device, the CPU can read the required information directly from the cache without waiting, so its efficiency is not affected.

3. The algorithms currently proposed can be divided into the following three categories (the first category is the key one to master):

(1) Traditional replacement algorithms and their direct evolutions. Representative algorithms include: ① the LRU (Least Recently Used) algorithm: replace the least recently used content out of the Cache; ② the LFU (Least Frequently Used) algorithm: replace the least frequently accessed content out of the Cache; ③ if all contents in the Cache were cached on the same day, replace the largest document out of the Cache, otherwise replace according to the LRU algorithm; ④ FIFO (First In First Out): follow the first-in first-out principle; if the Cache is full, replace the item that entered the Cache earliest.

(2) Replacement algorithms based on key characteristics of the cached content. Representative algorithms include: ① the Size replacement algorithm: replace the largest content out of the Cache; ② the LRU-MIN replacement algorithm: this algorithm strives to minimize the number of documents replaced. Suppose the size of the document to be cached is S; documents in the Cache with a size of at least S are replaced according to the LRU algorithm; if there is no object with a size of at least S, documents with a size of at least S/2 are replaced according to the LRU algorithm; ③ the LRU-Threshold replacement algorithm: the same as the LRU algorithm, except that documents whose size exceeds a certain threshold are not cached; ④ the Lowest Latency First replacement algorithm: replace the document with the least access delay out of the Cache.

(3) Cost-based replacement algorithms. This type of algorithm uses a cost function to evaluate the objects in the Cache and determines the object to replace according to the cost value. Representative algorithms include: ① the Hybrid algorithm: assigns a utility function to each object in the Cache and replaces the object with the least utility out of the Cache; ② the Lowest Relative Value algorithm: replaces the object with the lowest utility value out of the Cache; ③ the Least Normalized Cost Replacement (LCNR) algorithm: uses an inference function based on document access frequency, transmission time, and size to determine which document to replace; ④ Bolot et al. proposed a weighted inference function based on document transmission time cost, size, and last access time to determine document replacement; ⑤ the Size-Adjusted LRU (SLRU) algorithm: sorts cached objects according to the ratio of cost to size and selects the object with the smallest ratio for replacement.
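As one concrete example of a cost-based policy, the ⑤ Size-Adjusted LRU idea can be sketched as follows (the cache layout and names are hypothetical): objects are ranked by their cost-to-size ratio, and those with the smallest ratio are evicted first.

    def slru_evict(cache, need_space):
        """cache maps object id -> (cost, size); free at least need_space bytes.
        Objects with the smallest cost/size ratio are replaced first."""
        freed, victims = 0, []
        for obj_id, (cost, size) in sorted(cache.items(), key=lambda kv: kv[1][0] / kv[1][1]):
            if freed >= need_space:
                break
            victims.append(obj_id)
            freed += size
        for obj_id in victims:
            del cache[obj_id]
        return victims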

Function introduction

Throughout the development of computer technology, the access speed of the main memory has always been much slower than the operation speed of the central processing unit, so the high-speed processing capacity of the central processing unit cannot be fully utilized and the efficiency of the entire computer system suffers. There are many ways to alleviate the speed mismatch between the central processing unit and the main memory, such as the use of multiple general-purpose registers, multi-bank interleaving, and so on. The use of a cache memory in the storage hierarchy is also one of the commonly used methods. Many large and medium-sized computers, as well as some recent minicomputers and microcomputers, use a high-speed buffer memory.

The capacity of the cache memory is generally only a few percent of that of the main memory, but its access speed can match the central processing unit. According to the principle of program locality, there is a high probability that the units adjacent to a main memory unit currently in use will also be used. Therefore, when the central processing unit accesses a certain unit of the main memory, the computer hardware automatically transfers the group of units containing that unit into the cache memory, and the main memory unit that the central processing unit is about to access next is very likely to be in the group of units just transferred. The central processing unit can then access the cache memory directly. If, over the whole run, most of the central processing unit's accesses to the main memory can be replaced by accesses to the cache memory, the processing speed of the computer system improves significantly.

Read hit rate

When the CPU finds the data it needs in the Cache, it is called a hit; when the data the CPU needs is not in the Cache (this is called a miss), the CPU has to access the main memory. In theory, in a CPU with a two-level Cache, the hit rate for reading the L1 Cache is 80%. That is to say, the useful data the CPU finds in the L1 Cache accounts for 80% of the total, and the remaining 20% is read from the L2 Cache. Since the data to be executed cannot be predicted exactly, the hit rate for reading the L2 Cache is also about 80% (useful data read from L2 accounts for 16% of the total). The remaining data then has to be fetched from main memory, but this is already a fairly small proportion. In some high-end CPUs we often hear of an L3 Cache, which is a cache designed for the data that misses after reading the L2 Cache. In a CPU with an L3 Cache, only about 5% of the data needs to be fetched from main memory, which further improves the efficiency of the CPU.
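The percentages above follow from simple multiplication; the short sketch below only reproduces the arithmetic using the figures quoted in this paragraph.

    l1_hit = 0.80                          # fraction of reads satisfied by the L1 Cache
    l2_hit = 0.80                          # fraction of the remaining reads satisfied by L2

    from_l1 = l1_hit                       # 80% of all reads come from L1
    from_l2 = (1 - l1_hit) * l2_hit        # 20% x 80% = 16% of all reads come from L2
    residual = 1 - from_l1 - from_l2       # roughly 4-5% is left for L3 or main memory

    print(from_l1, from_l2, residual)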

In order to ensure a high hit rate when the CPU accesses the Cache, the contents of the Cache should be replaced according to a certain algorithm. A commonly used algorithm is the least recently used algorithm (LRU algorithm), which evicts the lines that have been accessed least in the most recent period. To this end, a counter is set up for each line. The LRU algorithm clears the counter of the hit line and adds 1 to the counters of the other lines. When replacement is needed, the line with the largest counter value is evicted. This is an efficient and well-founded algorithm: its counter-clearing process evicts data that is no longer needed after a burst of frequent use and improves the utilization of the Cache.
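The counter scheme described above can be written out as a small sketch (line count and tag handling are illustrative assumptions): the hit line's counter is cleared, every other counter is incremented, and on a miss the line with the largest counter is replaced.

    class CounterLRU:
        def __init__(self, lines):
            self.tags = [None] * lines        # which main-memory block each line holds
            self.counters = [0] * lines       # age counter per line

        def access(self, tag):
            if tag in self.tags:                              # hit
                hit = self.tags.index(tag)
                self.counters = [c + 1 for c in self.counters]
                self.counters[hit] = 0                        # clear the hit line's counter
                return "hit"
            # miss: use a free line if one exists, otherwise evict the largest counter
            victim = self.tags.index(None) if None in self.tags \
                     else self.counters.index(max(self.counters))
            self.counters = [c + 1 for c in self.counters]
            self.tags[victim] = tag
            self.counters[victim] = 0
            return "miss"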

The impact of the Cache replacement algorithm on the hit rate: when a new main memory block needs to be transferred into the Cache and the available space is already full, some data in the Cache has to be replaced, which creates the problem of a replacement strategy (algorithm). According to the law of program locality, a running program always tends to use the instructions and data that have been used recently, which provides a theoretical basis for the replacement strategy. The goal of a replacement algorithm is to give the Cache the highest possible hit rate. The Cache replacement algorithm is an important factor affecting the performance of a proxy cache system, and a good replacement algorithm produces a higher hit rate. The commonly used algorithms are as follows. (1) Random method (RAND method). The random replacement algorithm uses a random number generator to produce the number of the block to be replaced and then replaces that block. This algorithm is simple and easy to implement. However, it considers neither the past nor the future use of the Cache blocks, makes no use of the "historical information" of accesses to the upper memory, and does not follow the principle of locality of memory access, so the hit rate of the Cache cannot be improved and remains low.

(2) First-in first-out method (FIFO method). The first-in first-out (FIFO) algorithm replaces the information block that entered the Cache first. It determines the order of eviction according to the sequence in which blocks were transferred into the Cache and selects the block transferred in earliest for replacement. It does not need to record the usage of each block, so it is relatively easy to implement and has low system overhead. Its disadvantage is that program blocks that need to be used frequently (such as loop programs) may also be replaced simply because they were the earliest blocks to enter the Cache; the algorithm is not based on the principle of locality of memory access, so the hit rate of the Cache cannot be improved. This is because the earliest information may still be used in the future, or even used frequently, as in a loop program. The method is simple and convenient and makes use of the "historical information" of the main memory, but the fact that a block entered first does not mean that it is used infrequently. Its drawback is that it cannot correctly reflect the principle of program locality, the hit rate is not high, and anomalous behavior may occur.

(3) Least Recently Used (LRU) algorithm. This method replaces the least recently used information block in the Cache. This algorithm is better than the first-in first-out algorithm, but it cannot guarantee that a block used infrequently in the past will not be used in the future. The LRU method always selects the least recently used block for replacement based on the usage of each block. Although this method reflects the law of program locality well, it needs to keep track of the usage of every block in the Cache at all times in order to determine which block is the least recently used. The LRU algorithm is relatively reasonable, but it is more complicated to implement and its system overhead is relatively large. It is usually necessary to set up a hardware or software counter for each block to record its use.
