Categories
Ceph

I gave up on RAID for high availability

I now use Ceph instead. When I needed to restart the server hosting a RAID volume, I had to stop all the VMs using it. With Ceph I can restart a node without stopping anything. Ceph is an excellent choice for high availability (HA) because of its design and architecture.

Why Ceph Excels at Availability

Here are some key reasons why Ceph is well-suited for HA:

  1. Distributed Architecture: Ceph’s distributed architecture ensures that data is striped across multiple nodes, making it more resilient to failures. If one node fails, the remaining nodes can continue to operate and provide access to data.
  2. Self-healing: Ceph’s self-healing capabilities allow it to detect and automatically recover from node failures. This ensures that your storage system remains available even when individual components fail.
  3. No Single Point of Failure (SPOF): Ceph’s design eliminates SPOFs by distributing data across multiple nodes. If one node fails, the other nodes can take over without impacting availability.
  4. Scalability: Ceph scales horizontally to meet increasing storage demands, ensuring that your HA setup remains performant and efficient even as your data grows.
  5. Multi-site Replication: Ceph supports multi-site replication, which enables you to maintain a copy of your data at a secondary site for disaster recovery or load balancing purposes.
  6. High-performance replication: Ceph’s high-performance replication capabilities ensure that replicated data is kept up-to-date in real-time, minimizing the risk of data inconsistency.

How Ceph Achieves High Availability

Ceph achieves HA through several mechanisms:

  1. OSD (Object Storage Daemon) failures: If an OSD fails, the replicas stored on other OSDs continue to serve the data while Ceph re-creates the missing copies.
  2. PG (Placement Group) rebalancing: When a node fails, Ceph rebalances PGs across the remaining nodes to restore redundancy and continued availability.
  3. Monitors: Ceph’s monitors track the health of OSDs, automatically detect failures, and trigger the self-healing mechanisms.
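The computed-placement idea behind OSD failover and PG rebalancing can be sketched in a few lines. This is a toy model I wrote for illustration, not Ceph's actual CRUSH algorithm: PGs are ranked onto OSDs by a hash, so when an OSD disappears, every surviving node can recompute the same new mapping without any central lookup table.

```python
import hashlib

def place_pg(pg_id, osds, replicas=3):
    # Rank OSDs by a hash of (pg_id, osd) and keep the top `replicas`.
    # Placement is computed from the cluster map, not stored anywhere.
    key = lambda osd: hashlib.sha256(f"{pg_id}:{osd}".encode()).hexdigest()
    return sorted(osds, key=key)[:replicas]

osds = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
before = {pg: place_pg(pg, osds) for pg in range(8)}

# osd.2 fails: everyone recomputes placement over the survivors.
survivors = [o for o in osds if o != "osd.2"]
after = {pg: place_pg(pg, survivors) for pg in range(8)}

for pg in range(8):
    # Each PG still has 3 replicas, and at least 2 of its original
    # copies survive, so data stays readable while the third rebuilds.
    assert len(after[pg]) == 3
    assert len((set(before[pg]) - {"osd.2"}) & set(after[pg])) >= 2
```

Because placement is a pure function of the cluster map, there is no placement database that could itself become a single point of failure.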

Benefits of Using Ceph for HA

By using Ceph for your storage needs, you can enjoy:

  1. High uptime: Ceph’s design ensures that data remains accessible even in the event of node failures.
  2. Scalability: Ceph scales horizontally to meet increasing storage demands without impacting performance.
  3. Reduced maintenance: With Ceph, you can focus on running your applications rather than worrying about storage maintenance and upgrades.

In Conclusion

Ceph’s distributed architecture, self-healing capabilities, and multi-site replication make it an ideal choice for high availability. By leveraging Ceph for your storage needs, you can ensure that data remains accessible even in the face of hardware failures or outages.

Categories
Backup

Why rsync is bad for backups

While rsync is an excellent tool for transferring files, it has limitations when it comes to creating consistent backups. At a minimum you want crash-consistent backups, and that requires some kind of snapshot of the filesystem.

Why Rsync Can’t Do Consistent Backups

Here are the main reasons why rsync can’t do consistent backups:

  1. File system snapshots: To create a consistent backup, you need to take a snapshot of the file system at a specific point in time. However, rsync relies on the file system’s metadata to determine which files have changed, and it doesn’t capture any information about the overall consistency of the file system.
  2. Transaction logs: Modern databases use transaction logs to maintain consistency. These logs track all changes made to the database since the last checkpoint or backup. rsync can’t understand these logs or replicate them, which means it can’t ensure consistency.
  3. Locking and concurrency: In a multi-user environment, multiple users might be modifying files simultaneously. rsync has no way of knowing whether a file was modified before or after the point at which you want to create a consistent backup.
  4. Partial writes: When writing data to disk, many applications don’t write the entire buffer in one go; instead, they break it up into smaller chunks and perform multiple partial writes. rsync can’t detect these partial writes or ensure that all parts of a file are written correctly.
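Point 3 is easy to demonstrate. The sketch below is plain Python standing in for rsync's file-by-file walk (rsync itself is not involved): an "application" commits a new version of its two files between the two copies, so the backup mixes versions, which is exactly the torn state a filesystem snapshot would avoid.

```python
import os, shutil, tempfile

# Simulate an application that keeps two files which must always match,
# like a database's data file and its index.
workdir = tempfile.mkdtemp()
data = os.path.join(workdir, "data")
index = os.path.join(workdir, "index")

def app_write(version):
    with open(data, "w") as f:
        f.write(f"records-v{version}")
    with open(index, "w") as f:
        f.write(f"index-v{version}")

app_write(1)

# A file-by-file copy, which is effectively what rsync does:
backup = os.path.join(workdir, "backup")
os.makedirs(backup)
shutil.copy(data, backup)    # copies data at version 1
app_write(2)                 # application keeps running mid-backup
shutil.copy(index, backup)   # copies index at version 2

copied_data = open(os.path.join(backup, "data")).read()
copied_index = open(os.path.join(backup, "index")).read()
# The backup contains records-v1 with index-v2: an inconsistent state
# that never existed on disk at any single point in time.
assert copied_data == "records-v1" and copied_index == "index-v2"
```

A snapshot would have frozen both files at the same instant, so a restore would get either version 1 of both or version 2 of both.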

What Rsync Does Instead

While rsync can’t create consistent backups like some other tools (e.g., snapshotting software), it excels at:

  1. Incremental backups: By keeping track of which files have changed, rsync allows you to perform incremental backups, significantly reducing the time and space needed for backup purposes.
  2. File-level consistency: rsync ensures that each file is consistent within itself; it just doesn’t guarantee overall system consistency.

Alternatives for Consistent Backups

If you need consistent backups, consider using other tools specifically designed for this purpose:

  1. Snapshots: Take regular snapshots of your file systems or volumes using software like LVM (Logical Volume Manager) or ZFS.
  2. Database backup solutions: Use specialized database backup tools, such as PostgreSQL’s pg_dump or MySQL’s mysqldump, to capture the entire database state at a given point in time.
  3. Backup software with consistency features: Utilize backup software that includes consistency features, like Veeam backup and replication, which can create consistent backups by taking snapshots of file systems and capturing transaction logs.
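As a minimal, runnable stand-in for the pg_dump/mysqldump idea, Python's built-in sqlite3 module has a backup API that copies a database at a single point in time, even while the source connection is live. The table and values here are made up for the illustration.

```python
import sqlite3

# Live "production" database.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
src.executemany("INSERT INTO accounts VALUES (?, ?)",
                [("alice", 100), ("bob", 100)])
src.commit()

# Connection.backup() captures a transactionally consistent snapshot;
# naively copying the database file while it is open could not
# guarantee this.
dst = sqlite3.connect(":memory:")
src.backup(dst)

total = dst.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
assert total == 200  # the invariant holds in the backup
```

The point is the same as with pg_dump: the backup tool cooperates with the database engine, so it sees one coherent transaction boundary instead of whatever bytes happen to be on disk.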

Categories
Uncategorized

Why I need a failover cluster and a storage cluster

For many years I tried running a storage server and a compute server, and I never got it working well. The problem came whenever I updated the storage server or wanted to replace some part of it. First I had to stop all the services that used the storage, then fix the storage server. If the replacement of some part did not go well, the services could be down for a long time. At worst I had to give up for the day and continue the next day, leaving the services down for many hours.

Now it is not so important that every server is reliable. I can move workloads to another server in the cluster and start working on the server that has a problem.

There is now much less downtime when I do updates. Before restarting a node I can move its VMs to another server; a VM only pauses for a few seconds during a live migration.

I have one Windows failover cluster and one Linux Ceph cluster. I now know that reliable storage is very important: without it, nothing else is going to be reliable.

I am cheating on the fencing. For me automatic failover is not so important; it is enough to be able to move workloads manually. The services I run on the Linux cluster don't have any failover. I tried Corosync and Pacemaker, but they were difficult to set up and I don't really need them.

Sooner or later I believe I will have to move everything to the Linux cluster. It looks like Microsoft has stopped development on Hyper-V and Windows Server; they want everybody to switch to Azure, and they now have something called Azure Stack HCI. It is really nice of Microsoft to make it possible to install and run Windows Server Datacenter without activation. It is illegal, but I don't think they care about my homelab.

This story will be updated.

Categories
Uncategorized

I have read some documents and watched some videos about the Barings bank collapse

Nick Leeson is an interesting guy. The videos show some parts of an interview Nick did. Nick did not look worried about the money he wasted. He thought it was the managers' fault that the bank failed, because they were idiots not to discover what he did.

Nick Leeson was doing high-risk trades and hiding his losses. He managed the office in Singapore and told the office in London that he was making profits. Nick was not a good trader; he had to hide his losses in an account numbered 88888.

It is strange that they did not discover what he was doing; they were close many times. People traveled to the office in Singapore to check what he was doing, and even the auditors did not discover his 88888 account.

After some losses Nick had to get money from London. They sent him money nearly every day, believing it was short-term loans to customers. The London office started to wonder why he needed so much money and who the customers were.

When asked after the collapse, most people at the London office said that it was not their job to check where the money went. Some said only the Singapore office had the figures, so it was impossible to check anything. Some said they believed everything was OK after talking to Nick.

Nick's losses were not so bad from 1993 to the end of 1994; the bank could have survived that. In the beginning of 1995 it started to go downhill fast. Nick had made a mistake and not hidden a bad trade. There was a lot of talk about that at Barings in early 1995, but Nick convinced many that it was only an error and would be corrected. A company wanted £150M from Barings, so Nick used paper and scissors to fake a document that made it look like he had got the money back. In the upper corner there was text that said “Nick and Lisa”. He sent it from his home to the office, and those who looked at it thought it was real.

In the beginning of 1995 Nick started making big high-risk trades. On 17 January there was an earthquake in Kobe, Japan, and Nick's trades started to go bad. He tried to buy lots of futures to push the market up, but the market continued to go down. On 23 February Nick left the office and never came back. There were rumours about Barings having big positions on the OSE, but the managers in London were not worried because there were supposed to be positions on SIMEX to compensate.

Best document to read.

Categories
Network

Looks like Freenet is not secure

Our method is based on a simple observation. In our model, an observer who is one of g neighbors of the actual requester can expect to receive about 1/g of all requests, due to FOAF routing. Block locations, derived from a hash, are modeled as being evenly distributed on the circle. If the observer is actually two hops away from the original requester, then only about 1/(gh) of the requests will be received, assuming the requester has h neighbors.
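The 1/g and 1/(gh) figures fall straight out of a coin-flip model. The simulation below is my own toy version of that observation, not Freenet's actual FOAF routing: each request picks one neighbour uniformly at random at every hop.

```python
import random

random.seed(0)  # deterministic toy run

g = 5        # branching factor at the hop next to the observer
h = 4        # number of neighbours the requester has
trials = 100_000

# One hop: the observer is one of the requester's g neighbours,
# so it receives roughly 1/g of the requests.
one_hop = sum(random.randrange(g) == 0 for _ in range(trials))

# Two hops: the request must first go to one particular neighbour of
# the requester (1 in h), and that node must then forward it to the
# observer (1 in g), giving roughly 1/(g*h).
two_hops = sum(
    random.randrange(h) == 0 and random.randrange(g) == 0
    for _ in range(trials)
)

print(one_hop / trials)   # close to 1/g     = 0.2
print(two_hops / trials)  # close to 1/(g*h) = 0.05
```

The takeaway matches the paper: a direct malicious neighbour sees a lot, while every extra honest hop divides what any one observer can attribute to you.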

PDF document Levine 2020

Combined with harvesting and adaptive search attacks, this attack explains why opennet is regarded by many core developers as hopelessly insecure. If you want good security you need to connect only to friends.

Freenet help

If somebody collects a lot of block hashes, they can know what files a neighbour node is downloading, and they also know its IP address. I thought Freenet was better than that. The friends-only mode is not a good answer either. Do you have a friend that you want to know what you are downloading? If you collect enough “friends”, there will be at least one who will report what you are downloading.

To make this secure, a request would have to travel a few hops before anyone can see what is being requested.

I have not used Freenet for many years.

https://par.nsf.gov/servlets/purl/10281425

Categories
Uncategorized

How many shitheads are there on the internet

Reddit says it’s banning more people than ever in big transparency push

They show up everywhere. What is wrong with them? The revenge porn guys look like losers.

https://tech.slashdot.org/story/23/03/31/1739246/reddit-says-its-banning-more-people-than-ever-in-big-transparency-push

Categories
Uncategorized

I never believed drinking alcohol could be good

Drinking moderate amounts of alcohol every day does not – as once thought – protect against death from heart disease, nor does it contribute to a longer life, according to a sweeping new analysis of alcohol research.

https://www.spokesman.com/stories/2023/mar/31/no-moderate-drinking-isnt-good-for-your-health/

I always doubted the reports that a little alcohol could be good for you. To me it looked like people wished so much that it was not dangerous that they wanted to hear it could even be good. Research should not depend on what people want to believe, but it looks like that sometimes happens.

Categories
Tech

Mark Zuckerberg still wants to spy on Apple customers

https://www.axios.com/2022/11/30/zuckerberg-apple-policies-not-sustainable

Meta CEO Mark Zuckerberg on Wednesday added to the growing chorus of concerns about Apple, arguing that it’s “problematic that one company controls what happens on the device.”

Categories
Tech

Elon Musk is going to give up on fixing twitter

Some day he is going to tell the people still working at twitter to do whatever it takes to make it profitable. He doesn't want to sell more Tesla stock to keep twitter running.

Categories
Tech

Few are going to leave twitter and use Mastodon instead

Mastodon is confusing for most people. They are not going to understand what the fediverse is. They are not going to find their favourite celebrities on Mastodon; what they will find is impostors. How can anybody know who the real person behind a name is? How can they find their friends, when their friends could be on different instances? It could work if there were only a few instances that worked together, but that would not be a fediverse. The twitter blue mark was good. Elon ruined it.